DESCRIPTION

Development of Novel Anti-Microbial Agents Based on Bacteriophage Genomics 

BACKGROUND OF THE INVENTION

The present invention relates to the field of antibacterial agents and the 
treatment of infections of animals or other complex organisms by bacteria. 

The frequency and spectrum of antibiotic-resistant infections have, in recent
years, increased in both the hospital and community. Certain infections have become
essentially untreatable and are growing to epidemic proportions in the developing
world as well as in institutional settings in the developed world. The staggering
spread of antibiotic resistance in pathogenic bacteria has been attributed to microbial
genetic characteristics, widespread use of antibiotic drugs, and changes in society that
enhance the transmission of drug-resistant organisms. This spread of drug-resistant
microbes is leading to ever-increasing morbidity, mortality and health-care costs.

Ironically, it is the very success of antibiotics, resulting in their widespread
use, that has contributed the most to rising numbers of drug-resistant bacterial strains.
The longer a bacterial strain is exposed to a drug, the more likely it is to acquire
resistance. Today, a total of 160 antibiotics, all based on a few basic chemical
structures and targeting a small number of metabolic pathways, have found their way
to market. Over-prescription of these drugs, as well as the failure of patients to
comply with the complete antibiotic regimen, has led to the rapid emergence of
antibiotic-resistant strains. Such misuse of prescriptions, careless use of antibiotics in
virtually all commercial production of beef and fowl, and changing societal
conditions, such as the growth of day-care centers, increased long-term care in
hospitals, and increased mobility of the population, have provided an environment
where drug-resistant microbes can emerge and spread. Thus, virtually all common
infectious bacteria are becoming, or have already become, resistant to one or more
groups of antibiotics. Such resistance now reaches all classes of antibiotics currently
in use, including: β-lactams, fluoroquinolones, aminoglycosides, macrolide peptides,
chloramphenicol, tetracyclines, rifampicin, folate inhibitors, glycopeptides, and
mupirocin.

Over the last 45 years bacteria have adapted genetically to avoid the
destruction/alteration of the essential pathways that these chemotherapeutic agents
target. Antibiotic-resistant bacterial strains are now emerging at a higher rate than the
rate at which new antibiotics are being developed. The consequence of this dilemma
has been a dramatic increase in the cost of treating infections that would otherwise
easily succumb to routine antibiotic therapy. Furthermore, and perhaps most
importantly, the emergence of multiple drug resistant pathogenic bacteria has led to a
significant increase in morbidity and mortality, particularly in institutional settings.

Most major pharmaceutical companies have on-going drug discovery
programs for novel anti-microbials. These are based on screens for small molecule
inhibitors (natural products, bacterial culture media, libraries of small molecules,
combinatorial chemistry) of crucial metabolic pathways of the micro-organism of
interest (e.g., bacteria, fungi, parasites, worms). The screening process is largely for
cytotoxic compounds and in most cases is not based on a known mechanism of action
of the compounds. Pharmaceutical companies have large programs in this area.
Classical drug screening programs are being exhausted and many of these
pharmaceutical companies are looking towards rational drug design programs.

Several small to mid-size biotechnology companies as well as large
pharmaceutical companies have developed systematic high-throughput sequencing
programs to decipher the genetic code of specific micro-organisms of interest. The
goal is to identify, through sequencing, biochemical pathways or intermediates
that are unique to the microorganism. Knowledge of this may, in turn, form the
rationale for a drug discovery program based on the mechanism of action of the
identified enzymes/proteins. Genome Therapeutics Corp., The Institute for Genomic
Research, Human Genome Sciences Inc., and other companies have such sequencing
programs in place. However, one of the most critical steps in this approach is the
ascertainment that the identified proteins and biochemical pathways are 1) non-redundant
and essential for bacterial survival, and 2) constitute suitable and accessible
targets for drug discovery.

SUMMARY OF THE INVENTION

While animals such as humans are, on occasion, infected by pathogenic 
bacteria, bacteria also have natural enemies. A number of host-specific viruses, 
known as bacteriophages or phages, infect and kill bacteria in the natural 
environment. Such bacteriophages generally have small compact genomes and 
bacteria are their exclusive hosts. Many known bacteria are host to a large number of 
bacteriophages that have been described in the literature. During the 1940's - 1960's, 
phage biology was an area of active research. As a testimony to this, the study of 
phages which infect and inhibit the enteric bacterium Escherichia coli (E. coli) 
contributed much to the early understanding of molecular biology and virology. 

As is generally understood, bacteriophages (or phages) are viruses that infect
and kill bacteria. They are natural enemies of bacteria and, over the course of
evolution, have developed proteins (products of DNA sequences) which enable them
to infect a host bacterium, replicate their genetic material, usurp host metabolism, and
ultimately kill their host. The scientific literature well documents the fact that many 
known bacteria have a large number of such bacteriophages (Ackermann and DuBow, 
1987) that can infect and kill them (for example, see the ATCC bacteriophage 
collection at http://www.atcc.org). 

This invention utilizes the observation that bacteriophages successfully infect 
and inhibit or kill host bacteria, targeting a variety of normal host metabolic and 
physiological traits, some of which are shared by all bacteria, pathogenic and 
nonpathogenic alike. The term "pathogenic" as used herein denotes a contribution to 
or implication in disease or a morbid state of an infected organism. The invention 
thus involves identifying and elucidating the molecular mechanisms by which phages 
interfere with host bacterial metabolism, an objective being to provide novel targets 
for drug design. Whether the phage blocks bacterial RNA transcription or translation, 
or attacks other important metabolic pathways, such as cell wall assembly or 
membrane integrity, the basic blueprint for a phage's bacteria-inhibiting ability is 
encoded in its genome and can be unlocked using bioinformatics, functional 
genomics, and proteomics. By these means, the invention utilizes sequence 
information from the genomics of bacteriophage to identify novel antimicrobials that 
can be further used to actively and/or prophylactically treat bacterial infection. 

Two important components of the invention thus are: i) the identification of
bacteria-inhibiting phage open reading frames ("ORFs") and corresponding products
that can be used to develop antibiotics based on amino acid sequence and secondary
structural characteristics of the ORF products, and ii) the use of bacteriophages to map
out essential bacterial target genes and homologs, which can in turn lead to the
development of suitable anti-microbial agents. These two avenues represent new and
general methods for developing novel antimicrobials.

The invention thus concerns the identification of bacteriophage ORFs that
supply bacteria-inhibiting functions. In this regard, the terms "inhibit",
"inhibition", "inhibitory", and "inhibitor" all refer to a function of reducing a
biological activity or function. Such reduction in activity or function can, for
example, be in connection with a cellular component, e.g., an enzyme, or in
connection with a cellular process, e.g., synthesis of a particular protein, or in
connection with an overall process of a cell, e.g., cell growth. In reference to bacterial
cell growth, for example, an inhibitory effect (i.e., a bacteria-inhibiting effect) may be
bactericidal (killing of bacterial cells) or bacteriostatic (i.e., stopping or at least
slowing bacterial cell growth). The latter slows or prevents cell growth such that
fewer cells of the strain are produced relative to uninhibited cells over a given period
of time. From a molecular standpoint, such inhibition may equate with a reduction in
the level of, or elimination of, the transcription and/or translation of a specific
bacterial target(s), or reduction or elimination of activity of a particular target
biomolecule.

It is particularly advantageous to evaluate a plurality of different phage ORFs
for inhibitory activity; the ORFs may be from one phage, but are preferably from a
plurality of different phage. For example, evaluating ORFs from a number of different
phage of the same bacterial host provides at least two advantages. One is that the
multiple phages will provide identification of a variety of different targets. Second, it
is likely that multiple phage will utilize the same cellular target.

As used herein, the terms "bacteriophage" and "phage" are used
interchangeably to refer to a virus which can infect a bacterial strain or a number of
different bacterial strains.

In the context of this invention, the term "bacteriophage ORF" or "phage
ORF" or similar term refers to a nucleotide sequence in or from a bacteriophage. In
connection with a particular ORF, the terms refer to an open reading frame which has at
least 95% sequence identity, preferably at least 97% sequence identity, more
preferably at least 98% sequence identity with an ORF from the particular phage
identified herein (e.g., with an ORF as identified herein) or to a nucleic acid sequence
which has the specified sequence identity percentage with such an ORF sequence.
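
The identity tiers in the preceding definition can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two sequences are taken to be already aligned to equal length, gap characters are counted as mismatches, and the function names are hypothetical rather than part of the described method.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption: a gap character ('-') in either sequence counts as a mismatch.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-")
    return 100.0 * matches / len(a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 95% / 97% / 98% tiers named in the text."""
    if pct >= 98.0:
        return "more preferred (>= 98%)"
    if pct >= 97.0:
        return "preferred (>= 97%)"
    if pct >= 95.0:
        return "within definition (>= 95%)"
    return "outside definition (< 95%)"


# Hypothetical 100-nt reference ORF and a candidate with three substitutions.
reference = "ACGT" * 25
candidate = "C" + reference[1:4] + "C" + reference[5:8] + "C" + reference[9:]
pct = percent_identity(reference, candidate)
tier = identity_tier(pct)
```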

A first aspect of the invention thus provides a method for identifying a
bacteriophage nucleic acid coding region encoding a product active on an essential
bacterial target by identifying a nucleic acid sequence encoding a gene product which
provides a bacteria-inhibiting function when the bacteriophage infects a host
bacterium, preferably one that is an animal or plant pathogen, more preferably a bird
or mammalian pathogen, and most preferably a human pathogen. The bacteriophage
is an uncharacterized bacteriophage. Thus, the method excludes, for example, phage
λ, φX174, M13 and other E. coli-specific bacteriophage that have been studied with
respect to gene number and/or function. It also excludes, for example, the nucleic
acid coding regions described in Tables 12-14, and in preferred embodiments,
excludes the phage in which those regions are naturally located.

In connection with bacteriophage, the term "uncharacterized" means that a
certain bacteriophage's genome has not yet been fully identified such that the genes
having function involved in inhibiting host cells have not been identified. In
particular, phage for which the description of genomic or protein sequence was first
provided herein are uncharacterized. Phage sequences for which host bacteria-inhibiting
functions have been identified prior to the filing of the present application
(or alternatively prior to the present invention) are specifically excluded from the
aspects involving utilization of sequences from uncharacterized bacteriophage, except
that aspects may involve a plurality of phage where one or more of those phage are
uncharacterized and one or more others have been characterized to some extent. A
number of different bacteria-inhibiting phage ORFs are indicated in Tables 11-14.
The phage ORFs or sequences identified therein are not within the term
"uncharacterized"; alternatively, in preferred embodiments the phage containing those
ORFs are excluded from this term. Further, any additional phage ORFs (or
alternatively the phage which contain those ORFs) which have previously been
described in the art as bacteria-inhibiting ORFs are expressly excluded; those ORFs or
phage are known to those skilled in the art and the exclusion can be made express by
specifically naming such ORFs or phage as needed (likewise for uncharacterized
targets as described below). For the sake of brevity, such a listing is not expressly
presented, as such information is readily available to those skilled in the art.

Stating that an agent or compound is "active on" a particular cellular target,
such as the product of a particular gene, means that the target is an important part of a
cellular pathway which includes that target and that the agent acts on that pathway.
Thus, in some cases the agent may act on a component upstream or downstream of the
stated target, including on a regulator of that pathway or a component of that pathway.
By "essential", in connection with a gene or gene product, is meant that the host
cannot survive without, or is significantly growth compromised in the absence,
depletion, or alteration of, the functional product. An "essential gene" is thus one that
encodes a product that is beneficial, or preferably necessary, for cellular growth in
vitro in a medium appropriate for growth of a strain having a wild-type allele
corresponding to the particular gene in question. Therefore, if an essential gene is
inactivated or inhibited, that cell will grow significantly more slowly, preferably at less
than 20%, more preferably less than 10%, most preferably less than 5% of the growth
rate of the uninhibited wild-type, or not at all, in the growth medium. Preferably, in
the absence of activity provided by a product of the gene, the cell will not grow at all
or will be non-viable, at least under culture conditions similar to the in vivo conditions
normally encountered by the bacterial cell during an infection. For example, absence
of the biological activity of certain enzymes involved in bacterial cell wall synthesis
can result in the lysis of cells under normal osmotic conditions, even though
protoplasts can be maintained under controlled osmotic conditions. In the context of
the invention, essential genes are generally the preferred targets of antimicrobial
agents. Essential genes can encode target molecules directly or can encode a product
involved in the production, modification, or maintenance of a target molecule.
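
The growth-rate thresholds in the definition of an essential gene can be expressed as a simple calculation. The following Python sketch assumes that growth rates for the inhibited strain and the uninhibited wild type have been measured in the same units and medium; the function names and the tier labels are illustrative assumptions, not terms from the specification.

```python
def relative_growth(inhibited_rate: float, wild_type_rate: float) -> float:
    """Growth rate of the inhibited strain as a percentage of wild type."""
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    return 100.0 * inhibited_rate / wild_type_rate


def essentiality_tier(pct_of_wild_type: float) -> str:
    """Classify relative growth against the 20% / 10% / 5% thresholds in the text."""
    if pct_of_wild_type <= 0:
        return "no growth"
    if pct_of_wild_type < 5:
        return "most preferred (< 5% of wild type)"
    if pct_of_wild_type < 10:
        return "more preferred (< 10% of wild type)"
    if pct_of_wild_type < 20:
        return "preferred (< 20% of wild type)"
    return "not below the 20% threshold"


# Hypothetical measurements in arbitrary but matching units.
pct = relative_growth(3, 100)
tier = essentiality_tier(pct)
```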

A "target" refers to a biomolecule that can be acted on by an exogenous agent,
thereby modulating, preferably inhibiting, growth or viability of a cell. In most cases
such a target will be a nucleic acid sequence or molecule, or a polypeptide or protein.
However, other types of biomolecules can also be targets, e.g., membrane lipids and
cell wall structural components.

The term "bacterium" refers to a single bacterial strain, and includes a single
cell, and a plurality or population of cells of that strain unless clearly indicated to the
contrary. In reference to bacteria or bacteriophage, the term "strain" refers to bacteria
or phage having a particular genetic content. The genetic content includes genomic
content as well as recombinant vectors. Thus, for example, two otherwise identical
bacterial cells would represent different strains if each contained a vector, e.g., a
plasmid, with different phage ORF inserts.

In preferred embodiments, the phage is Staphylococcus aureus phage 77, 3A,
96, or 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage
Dp-1.

In preferred embodiments, the phage is selected from. Preferred embodiments
involve expressing at least one recombinant phage ORF(s) in a bacterial host followed
by inhibition analysis of that host. Inhibition following expression of the phage ORF
is indicative that the product of the ORF is active on an essential bacterial target.
Such evaluation can be carried out in a variety of different formats, such as on a
support matrix such as a solidified medium in a petri dish, or in liquid culture.
Preferably a plurality of phage ORFs are expressed in at least one bacterium. The
plurality of phage ORFs can be from one or a plurality of phage. With respect to a
single phage or at least one phage in a plurality of phages, the plurality of expressed
ORFs preferably represents at least 10%, more preferably at least 20%, 40%, or 60%,
still more preferably at least 80% or 90%, and most preferably at least 95% of the
ORFs in the phage genome. For a plurality of phage, the plurality of
expressed ORFs preferably represents at least 10%, more preferably at least 20%,
40%, or 60%, still more preferably at least 80% or 90%, and most preferably at least
95% of the ORFs in the phage genome of each phage. The plurality of phage ORFs
can be expressed in a single bacterium, or in a plurality of bacteria where one ORF is
expressed in each bacterium, or in a plurality of bacteria where a plurality of ORFs are
expressed in at least one or in all of the plurality of bacteria, or combinations of these.
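
The coverage percentages above amount to a ratio of expressed ORFs to all ORFs in a phage genome. A minimal Python sketch of that bookkeeping follows; the ORF identifiers and function names are hypothetical, and the sketch assumes each ORF is tracked by a unique label.

```python
from typing import Iterable, Optional

# Coverage thresholds named in the text, highest first.
COVERAGE_THRESHOLDS = (95, 90, 80, 60, 40, 20, 10)


def orf_coverage(expressed: Iterable[str], genome_orfs: Iterable[str]) -> float:
    """Percentage of a phage genome's ORFs that have been expressed."""
    genome = set(genome_orfs)
    if not genome:
        raise ValueError("genome ORF list must not be empty")
    # Only ORFs that actually belong to this genome are counted.
    return 100.0 * len(set(expressed) & genome) / len(genome)


def highest_threshold_met(pct: float) -> Optional[int]:
    """Return the highest listed threshold that the coverage reaches, if any."""
    for threshold in COVERAGE_THRESHOLDS:
        if pct >= threshold:
            return threshold
    return None


# Hypothetical phage with 20 ORFs, of which 17 were expressed.
genome = [f"ORF{n:03d}" for n in range(1, 21)]
expressed = genome[:17]
pct = orf_coverage(expressed, genome)
met = highest_threshold_met(pct)
```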

In embodiments of the above aspect (as well as in other aspects herein) in
which a plurality of phage are utilized, a plurality of phage have the same bacterial
host species; have different bacterial host species; or both. The plurality of phage
includes at least two different phage, preferably at least 3, 4, 5, 6, 8, 10, 15, 20, or more
different phage. Indeed, more preferably, the plurality of phage will include 50, 75,
100, or more phage. As described herein, the larger number of phage is useful to
provide additional target and target evaluation information useful in developing
antibacterial agents, for example, by providing identification of a larger range of
bacterial targets, and/or providing further indication of the suitability of a particular
target (for example, utilization of a target by a number of different unrelated phage
can suggest that the target is particularly stable and accessible and effective) and/or
can indicate alternate sites on a target which interact with different inhibitors.

Further embodiments involve confirmation of the inhibitor function of the
phage ORF, such as by utilizing or incorporating a control(s) designed to confirm the
inhibitory nature of the ORF(s) being evaluated. The control can, for example, be
provided by expression of an inactive or partially inactive form of the ORF or ORF
product, and/or by the absence of expression of the ORF or ORF product in the same
or a closely comparable bacterial strain as that used for expression of the test ORF.
The reduced level of activity or the absence of active ORF product in the control will
thus not provide the inhibition provided by a corresponding inhibitory ORF, or will
provide a distinguishably lower level of inhibition. An inactivated or partially
inactivated control has a mutation(s), e.g., in the coding region or in flanking
regulatory elements, that reduce(s) or eliminate(s) the normal function of the ORF.
Thus, the inhibition of a bacterium following expression of a phage ORF is
determined by comparison with the effects of expression of an inactivated ORF or the
response of the bacteria in the absence of expression in the same or similar type
bacterium. Such determination of inhibition of the bacterium following expression of
the ORF is indicative of a bacteria-inhibiting function. These manipulations are
routinely understood and accomplished by those of skill in the art using standard
techniques. In embodiments utilizing absence of expression of the ORF, the bacteria
can, for example, contain an empty vector or a vector which allows expression of an
unrelated sequence which is preferably non-inhibitory. Alternatively, the bacteria
may have no vector at all. Combinations of such controls or other controls may also
be utilized as recognized by those skilled in the art.
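
The comparison of a test strain against its controls can be sketched numerically. The Python example below is an assumption-laden illustration: it takes culture density readings (for example, optical density) as the growth measure, and the 0.5 cutoff, the control names, and the function names are hypothetical choices, not values from the specification.

```python
from typing import Mapping


def fractional_inhibition(test_density: float, control_density: float) -> float:
    """Fraction by which growth of the test strain is reduced relative to a control."""
    if control_density <= 0:
        raise ValueError("control density must be positive")
    return 1.0 - test_density / control_density


def is_inhibitory(test_density: float,
                  control_densities: Mapping[str, float],
                  cutoff: float = 0.5) -> bool:
    """True if the test ORF reduces growth by at least `cutoff` versus every control.

    The cutoff is a hypothetical value; in practice it would be set from
    replicate measurements and the variability of the assay.
    """
    return all(
        fractional_inhibition(test_density, density) >= cutoff
        for density in control_densities.values()
    )


# Hypothetical readings after induction.
controls = {"empty_vector": 1.0, "inactivated_orf": 0.8}
strong = is_inhibitory(0.2, controls)   # growth clearly reduced against both controls
weak = is_inhibitory(0.7, controls)     # growth only slightly reduced
```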

In embodiments involving expression of a phage ORF in a bacterial strain, in 
preferred embodiments that expression is inducible. 

By "inducible" is meant that expression is absent or occurs at a low level until 
the occurrence of an appropriate environmental stimulus provides otherwise. For the 
present invention such induction is preferably controlled by an artificial 
environmental change, such as by contacting a bacterial strain population with an 
inducing compound (i.e., an inducer). However, induction could also occur, for
example, in response to build-up of a compound produced by the bacteria in the 
bacterial culture, e.g., in the medium. As uncontrolled or constitutive expression of 
inhibitory ORFs can severely compromise bacteria to the point of eradication, such 
expression is therefore undesirable in many cases because it would prevent effective 
evaluation of the strain and inhibitor being studied. For example, such uncontrolled 
expression could prevent any growth of the strain following insertion of a 
recombinant ORF, thus preventing determination of effective transfection or 
transformation. A controlled or inducible expression is therefore advantageous and is 
generally provided through the provision of suitable regulatory elements, e.g., 
promoter/operator sequences that can be conveniently transcriptionally linked to a 
coding sequence to be evaluated. In most cases, the vector will also contain 
sequences suitable for efficient replication of the vector in the same or different host 
cells and/or sequences allowing selection of cells containing the vector, i.e., 
"selectable markers." Further, preferred vectors include convenient primer sequences 
flanking the cloning region from which PCR and/or sequencing may be performed. 

As knowledge of the nucleotide sequence of phage ORFs is useful, e.g., for
assisting in the identification of phage proteins active against essential bacterial host
targets, preferred embodiments involve the sequencing of at least a portion of the
phage genome in combination with the above methods. This can be done either before
or after or independent of expression and inhibition of the ORF in the bacteria, and
provides information on the nature and characteristics of the ORF. Such a portion is
preferably at least 10%, 20%, 40%, 80%, 90%, or 100% of the phage genome. For
embodiments in which a plurality of phage are utilized, preferably each phage is
sequenced to an extent as just specified.

Such sequencing is preferably accompanied by computer sequence analysis to
define and evaluate ORF(s), ORF products, structural motifs or functional properties
of ORF products, and/or their genetic control elements. Thus, certain embodiments
incorporate computer sequence analyses of nucleic acid and/or amino acid sequences.
Further, existing data banks can provide phage sequence and product information
which can be utilized for analysis and identification of ORFs in the sequence.
Computer analysis may further employ known homologous sequences from other
species that suggest or indicate conserved underlying biochemical function(s) for the
inhibitory or potentially inhibitory ORF sequence(s) being evaluated. This can
include the sequences of signature motifs of identified classes of inhibitors.
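
As an illustration of the kind of computer analysis used to define ORFs, the following Python sketch scans both strands of a nucleotide sequence in all three reading frames. It is a deliberately simplified assumption: only ATG is treated as a start codon, only TAA, TAG, and TGA are treated as stops, and the minimum length is a hypothetical parameter. Dedicated gene-finding software would handle alternative start codons and other signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (non-ACGT symbols are left as-is)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Return (strand, frame, start, end, orf) tuples for simple ATG...stop ORFs.

    Coordinates are 0-based on the scanned strand; `end` is exclusive and
    includes the stop codon. `min_codons` counts the start and stop codons.
    """
    results = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return results


# Hypothetical 16-nt fragment containing one short ORF on the forward strand.
orfs = find_orfs("CCATGAAATTTTAGGG")
```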

In the context of the phage nucleic acid sequences, e.g., gene sequences, of this
invention, the terms "homolog" and "homologous" denote nucleotide sequences from
different bacteria or phage strains or species or from other types of organisms that
have significantly related nucleotide sequences, and consequently significantly related
encoded gene products, preferably having related function. Homologous gene
sequences or coding sequences have at least 70% sequence identity (as defined by the
maximal base match in a computer-generated alignment of two or more nucleic acid
sequences) over at least one sequence window of 48 nucleotides, more preferably at
least 80 or 85%, still more preferably at least 90%, and most preferably at least 95%.
The polypeptide products of homologous genes have at least 35% amino acid
sequence identity over at least one sequence window of 18 amino acid residues, more
preferably at least 40%, still more preferably at least 50% or 60%, and most
preferably at least 70%, 80%, or 90%. Preferably, the homologous gene product is
also a functional homolog, meaning that the homolog will functionally complement
one or more biological activities of the product being compared. For nucleotide or
amino acid sequence comparisons where a homology is defined by a % sequence
identity, the percentage is determined using BLAST programs with default
parameters (Altschul et al., 1997, "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs," Nucleic Acids Res. 25:3389-3402).
Any of a variety of algorithms known in the art which provide comparable results can
also be used, preferably using default parameters. Performance characteristics for
three different algorithms in homology searching are described in Salamov et al., 1999,
"Combining sensitive database searches with multiple intermediates to detect distant
homologues." Protein Eng. 12:95-100. Another exemplary program package is the
GCG™ package from the University of Wisconsin.
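
The window-based criterion above (at least 70% identity over some 48-nucleotide window, or at least 35% over some 18-residue window) can be sketched in Python. This is not a substitute for BLAST, which the text names as the method of determination; it assumes the two sequences have already been aligned to equal length by some other tool, counts gaps as mismatches, and uses hypothetical function names.

```python
def max_window_identity(a: str, b: str, window: int) -> float:
    """Highest percent identity over any run of `window` aligned positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if window <= 0 or len(a) < window:
        raise ValueError("alignment is shorter than the requested window")
    matches = [1 if x == y and x != "-" else 0
               for x, y in zip(a.upper(), b.upper())]
    current = sum(matches[:window])
    best = current
    # Slide the window one position at a time, updating the match count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


def is_nucleotide_homolog(a: str, b: str) -> bool:
    """Minimum nucleotide criterion from the text: >= 70% over a 48-nt window."""
    return max_window_identity(a, b, 48) >= 70.0


def is_protein_homolog(a: str, b: str) -> bool:
    """Minimum polypeptide criterion from the text: >= 35% over an 18-residue window."""
    return max_window_identity(a, b, 18) >= 35.0


# Hypothetical aligned sequences: identical for 48 nt, then divergent.
seq_a = "ACGT" * 12 + "A" * 12
seq_b = "ACGT" * 12 + "C" * 12
nt_homolog = is_nucleotide_homolog(seq_a, seq_b)
```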

Homologs may also or in addition be characterized by the ability of two
complementary nucleic acid strands to hybridize to each other under appropriately
stringent conditions. Hybridizations are typically and preferably conducted with
probe-length nucleic acid molecules, preferably 20-100 nucleotides in length. Those
skilled in the art understand how to estimate and adjust the stringency of hybridization
conditions such that sequences having at least a desired level of complementarity will
stably hybridize, while those having lower complementarity will not. For examples of
hybridization conditions and parameters, see, e.g., Maniatis, T. et al. (1989)
Molecular Cloning: A Laboratory Manual. Cold Spring Harbor University Press, Cold
Spring, N.Y.; Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology.
John Wiley & Sons, Secaucus, NJ. Homologs and homologous gene sequences may
thus be identified using any nucleic acid sequence of interest, including the phage
ORFs and bacterial target genes of the present invention.

A typical hybridization, for example, utilizes, besides the labeled probe of
interest, a salt solution such as 6xSSC (NaCl and Sodium Citrate base) to stabilize
nucleic acid strand interaction, a mild detergent such as 0.5% SDS, together with
other typical additives such as Denhardt's solution and salmon sperm DNA. The
solution is added to the immobilized sequence to be probed and incubated at suitable
temperatures to preferably permit specific binding while minimizing nonspecific
binding. The temperature of the incubations and ensuing washes is critical to the
success and clarity of the hybridization. Stringent conditions employ relatively higher
temperatures, lower salt concentrations, and/or more detergent than do non-stringent
conditions. Hybridization temperatures also depend on the length, complementarity
level, and nature (i.e., "GC content") of the sequences to be tested. Typical stringent
hybridizations and washes are conducted at temperatures of at least 40°C, while lower
stringency hybridizations and washes are typically conducted at 37°C down to room
temperature (~25°C). One of skill in the art is aware that these conditions may vary
according to the parameters indicated above, and that certain additives such as
formamide and dextran sulphate may also be added to affect the conditions.
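
Because the text notes that hybridization temperature depends on probe length and GC content, a short Python sketch of how those two quantities might be tabulated for a candidate probe is given here. It computes only the length and the GC fraction and checks the length against the 20-100 nucleotide range preferred earlier; it does not predict a hybridization temperature, and the function names and example probe are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("sequence must not be empty")
    return (s.count("G") + s.count("C")) / len(s)


def probe_summary(seq: str, min_len: int = 20, max_len: int = 100) -> dict:
    """Summarize the probe properties that the text says influence stringency."""
    return {
        "length": len(seq),
        "gc_fraction": gc_fraction(seq),
        # The 20-100 nt default reflects the preferred probe length in the text.
        "within_preferred_length": min_len <= len(seq) <= max_len,
    }


# Hypothetical 24-nt probe.
summary = probe_summary("ATGCGCGCATATGCGCGCATATGC")
```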

By "stringent hybridization conditions" is meant hybridization conditions at
least as stringent as the following: hybridization in 50% formamide, 5X SSC, 50 mM
NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X
Denhardt's solution at 42°C overnight; washing with 2X SSC, 0.1% SDS at 45°C; and
washing with 0.2X SSC, 0.1% SDS at 45°C.

In sequence comparison analyses, an ORF, or motif, or set of motifs in a
bacteriophage sequence can be compared to known inhibitor sequences, e.g.,
homologous sequences encoding homologous inhibitors of bacterial function.
Likewise, the analysis can include comparison with the structure of essential bacterial
gene products, as structural similarities can be indicative of similar or replacement
biological function. Such analysis can include the identification of a signature, or
characteristic motif(s) of an inhibitor or inhibitor class.

Also, the identification of structural motifs in an encoded product, based on
nucleotide or amino acid sequence analysis, can be used to infer a biochemical
function for the product. A database containing identified structural motifs in a large
number of sequences is available for identification of motifs in phage sequences. The
database is PROSITE, which is available at www.expasy.ch/cgi-bin/scanprosite. The
identification of motifs can, for example, include the identification of signature motifs
for a class or classes of inhibitory proteins. Other such databases may also be used.
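
Motif identification of the kind described above can be illustrated by translating a pattern written in PROSITE-style syntax into a regular expression and scanning a protein sequence with it. The Python sketch below covers only the basic syntax elements (x for any residue, square brackets for alternatives, curly braces for exclusions, parenthesized repeat counts, and the < and > terminal anchors). The pattern and protein shown are illustrative examples, and the function names are hypothetical; this is not the ScanProsite implementation.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.strip()
        prefix = suffix = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        core, repeat = element, ""
        if element.endswith(")"):            # repetition such as x(3) or x(2,4)
            core, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if core == "x":                      # any amino acid
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan(pattern: str, protein: str):
    """Return (start, matched_text) for each non-overlapping match in the protein."""
    rx = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in rx.finditer(protein.upper())]


# Illustrative zinc-finger-like pattern and a made-up protein containing one match.
PATTERN = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H."
motif = "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H"
hits = scan(PATTERN, "MK" + motif + "GG")
```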

In aspects and preferred embodiments described herein, in which a bacterium
or host bacterium is specified, the bacterium or host bacterium is preferably selected
from a pathogenic bacterial species, for example, one selected from Table 1.
Preferably, an animal or plant pathogen is used. For animals, preferably the bacterium
is a bird or mammalian pathogen, still more preferably a human pathogen.

In aspects and preferred embodiments involving a bacteriophage or sequences
from a bacteriophage, one or more bacteriophage are preferably selected from those
listed in Table 1. Those exemplary bacteriophage are readily obtained from the
indicated sources.

In some cases, it is advantageous to utilize phage with non-pathogenic host
bacteria. The genome, structural motif, ORF, homolog, and other analyses described
herein can be performed on such phage and bacteria. Such analysis provides useful
information and compositions. The results of such analyses can also be utilized in
aspects of the present invention to identify homologous ORFs, especially inhibitor
ORFs in phage with pathogenic bacterial hosts. Similarly, identification of a target in
a non-pathogenic host can be used to identify homologous sequences and targets in
pathogenic bacteria, especially in genetically closely related bacteria. Those skilled in
the art are familiar with bacterial genetic relationships and with how to determine
relatedness based on levels of genomic identity or other measures of nucleotide
sequence and/or amino acid sequence similarity, and/or other physical and culture
characteristics such as morphology, nutritional requirements, or minimal media to
support growth.

Also in preferred embodiments, an embodiment of this aspect is combined
with an embodiment of the following aspect.

A related aspect of the invention provides methods for identifying a target for 
antibacterial agents by identifying the bacterial target(s) of at least one 
uncharacterized or untargeted inhibitor protein or RNA from a bacteriophage. Such
identification allows the development of antibacterial agents active on such targets. 
Preferred embodiments for identifying such targets involve the identification of 
binding of target and phage ORF products to one another. The phage ORF products 
may be subportions of a larger ORF product that also binds the host target. In 
preferred embodiments, the phage protein or RNA is from an uncharacterized
bacteriophage in Table 1. This aspect preferably includes the identification of a
plurality of such targets in one or a plurality of different bacteria, preferably in one or
a plurality of bacteria listed in Table 1.

In preferred embodiments of this aspect and other aspects of this invention 
involving particular phage ORFs or phage sequences, the ORF is Staphylococcus
aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application
09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or
Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated for the above aspect, preferably the method involves the use of a
plurality of different phage, and thus a plurality of different phage inhibitors and/or
inhibitor ORFs.

In addition to uncharacterized phage ORF products, it is also useful to identify
the targets of phage ORF products which are known to be inhibitors of host bacteria,
but where the target has not been identified. Thus, such inhibitors can likewise be
utilized as "untargeted" inhibitor phage ORFs and ORF products, e.g., proteins or 
RNAs. 

In the context of inhibitor proteins or RNAs from a phage, the term
"uncharacterized" means that a bacteria-inhibiting function for the protein has not
previously been identified. Preferably, but not necessarily, the sequence of the protein
or the corresponding coding region or ORF was not described in the art before the
filing of the present application for patent (or alternatively prior to the present
invention). Thus, this term specifically excludes any bacteria-inhibiting phage protein
and its associated bacterial target which has been identified as inhibitory before the
present invention or alternatively before the filing of the present application, for
example those identified in Tables 12-14 or otherwise identified herein. For example,
from E. coli, phage T7 genes 0.7 and 2.0 target the host RNA polymerase, and phage T4
gp55/gp33 alter the specificity of host RNA polymerase. The T4 regB gene product
also targets the host translation apparatus. As with the uncharacterized bacteriophage 
ORFs or bacteriophage above, for such identified proteins, the sequences encoding 
those proteins are excluded from the uncharacterized inhibitor proteins. 

The term "fragment" refers to a portion of a larger molecule or assembly. For
proteins, the term "fragment" refers to a molecule which includes at least 5 contiguous
amino acids from the reference polypeptide or protein, preferably at least 8, 10, 12,
15, 20, 30, 50 or more contiguous amino acids. In connection with oligo- or
polynucleotides, the term "fragment" refers to a molecule which includes at least 15
contiguous nucleotides from a reference polynucleotide, preferably at least 24, 30, 36,
45, 60, 90, 150, or more contiguous nucleotides.
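
The length thresholds in this definition can be checked mechanically. The following Python sketch is illustrative only: the function name shares_contiguous_run, the constant names, and the example sequences are hypothetical and are not taken from the specification. It tests whether a candidate molecule includes at least a given number of contiguous residues from a reference sequence.

```python
# Minimum contiguous lengths taken from the definition of "fragment" above.
PROTEIN_MIN_RESIDUES = 5
NUCLEOTIDE_MIN_BASES = 15


def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if candidate contains at least min_len contiguous
    residues that also occur contiguously in reference."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # Every window of length min_len found in the reference sequence.
    reference_windows = {
        reference[i:i + min_len] for i in range(len(reference) - min_len + 1)
    }
    # The candidate qualifies if any of its windows matches a reference window.
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )


# Hypothetical sequences, for illustration only.
reference_protein = "MKTAYIAKQRQISFVKSHFSRQ"
print(shares_contiguous_run("GGAYIAKGG", reference_protein, PROTEIN_MIN_RESIDUES))
print(shares_contiguous_run("GGAYIAGG", reference_protein, PROTEIN_MIN_RESIDUES))
```

The first call shares the five-residue run AYIAK with the reference and so meets the protein threshold; the second shares only four contiguous residues and does not. Stricter preferred thresholds (e.g., 8, 10, or 12 residues) can be applied by changing min_len.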

Preferred embodiments involving identification of binding include methods
for distinguishing bound molecules, for example, affinity chromatography,
immunoprecipitation, crosslinking, and/or genetic screen methods that permit
protein:protein interactions to be monitored. One of skill in the art is familiar with
these techniques and common materials utilized (see, e.g., Coligan, J. et al. (eds.) 
(1995) Current Protocols in Protein Science. John Wiley & Sons, Secaucus, N.J.). 

Genetic screening for the identification of protein:protein interactions typically
involves the co-introduction of both a chimeric bait nucleic acid sequence (here, the
phage ORF to be tested) and a chimeric target nucleic acid sequence that, when
co-expressed and having affinity for one another in a host cell, stimulate reporter gene
expression to indicate the relationship. A "positive" can thus suggest a potential
inhibitory effect in bacteria. This is discussed in further detail in the Detailed
Description section below. In this way, new bacterial targets can be identified that are
inhibited by specific phage ORF products or derivatives, fragments, mimetics, or
other molecules.

Other embodiments involve the identification and/or utilization of mutant
targets by virtue of their host's relatively unresponsive nature in the presence of
expression of ORFs previously identified as inhibitory to the non-mutant or wild-type
strain. Such mutants have the effect of protecting the host from an inhibition that
would otherwise occur and indirectly allow identification of the precise responsible
target for follow-up studies and anti-microbial development. In certain embodiments,
rescue from inhibition occurs under conditions in which a bacterial target or mutant
target is highly expressed. This is performed, for example, through coupling of the
sequence with regulatory element promoters, e.g., as known in the art, which regulate
expression at levels higher than wild-type, e.g., at a level sufficiently higher that the
inhibitor can be competitively bound to the highly expressed target such that the
bacterium is detectably less inhibited.

Identification of the bacterial target can involve identification of a phage- 
specific site of action. This can involve a newly identified target, or a target where the 
phage site of action differs from the site of action of a previously known antibacterial 
agent or inhibitor. For example, phage T7 genes 0.7 and 2.0 target the host RNA
polymerase, which is also the cellular target for the antibacterial agent, rifampin. To 
the extent that a phage product is found to act at a different site than previously 
described inhibitors, aspects of the present invention can utilize those new, phage- 
specific sites for identification and use of new agents. The site of action can be 
identified by techniques well-known to those skilled in the art, for example, by 
mutational analysis, binding competition analysis, and/or other appropriate 
techniques. 

Once a bacterial host target protein or nucleic acid or mutant target sequence 
has been identified and/or isolated, it too can be conveniently sequenced, sequence 
analyzed (e.g., by computer), and the underlying gene(s), and corresponding translated 
product(s) further characterized. Preferred embodiments include such analysis and 
identification. Preferably such a target has not previously been identified as an 
appropriate target for antibacterial action. 

Certain embodiments include the identification of at least one inhibitory phage 
ORF or ORF product, e.g., as described for the above aspect, and thus are a 
combination of the two aspects. 

Additionally, the invention provides methods for identifying targets for
antibacterial agents by identifying homologs of a bacterial target (e.g., from S. aureus,
Enterococcus faecalis or other Enterococci, and Streptococcus pneumoniae) of a
bacteriophage inhibitory ORF product. Such homologs may be utilized in the various
aspects and embodiments described herein, as described for the host Enterococcus sp.
for bacteriophage 182.

Other aspects of the invention provide isolated, purified, or enriched specific 
phage nucleic acid and amino acid sequences, subsequences, and homologs thereof for 
phage selected from uncharacterized phage listed in Table 1, preferably from
bacteriophage 77, 3A, 96, 44AHJD (Staphylococcus aureus host bacterium), Dp-1
(Streptococcus pneumoniae host), or 182 (Enterococcus host) or other phage listed in
Table 1 for those bacteria. For example, such sequences do not include sequences
identified in any of Tables 11-14. Nucleotide sequences of this aspect are at least 15
nucleotides in length, preferably at least 18, 21, 24, or 27 nucleotides in length, more
preferably at least 30, 50, or 90 nucleotides in length. In certain embodiments, longer
nucleic acids are preferred, for example those of at least 120, 150, 200, 300, 600, 900
or more nucleotides. Such sequences can, for example, be amplification 
oligonucleotides (e.g., PCR primers), oligonucleotide probes, sequences encoding a 
portion or all of a phage-encoded protein, or a fragment or all of a phage-encoded 
protein. In preferred embodiments, the nucleic acid sequence contains a sequence 
which is within a length range with a lower length as specified above, and an upper 
length limit which is no more than 50, 60, 70, 80, or 90% of the length of the 
corresponding full-length ORF. The upper length limit can also be expressed in terms 
of the number of base pairs of the ORF (coding region). In preferred embodiments, 
the nucleic acid sequence is from Staphylococcus aureus phage 77 ORF 17, 19, 43, 
102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage
44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004,
008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 
002, 008, or 014. 

As it is recognized that alternate codons will encode the same amino acid for 
most amino acids due to the degeneracy of the genetic code, the sequences of this 
aspect include nucleic acid sequences utilizing such alternate codon usage for one or
more codons of a coding sequence. For example, all four nucleic acid sequences
GCT, GCC, GCA, and GCG encode the amino acid alanine. Therefore, if for an
amino acid there exists an average of three codons, a polypeptide of 100 amino acids
in length will, on average, be encoded by 3^100, or about 5 x 10^47, nucleic acid sequences.
Thus, a nucleic acid sequence can be modified (e.g., a nucleic acid sequence from a 
phage as specified above) to form a second nucleic acid sequence encoding the same 
polypeptide as encoded by the first nucleic acid sequence using routine procedures 
and without undue experimentation. Thus, all possible nucleic acid sequences that 
encode the specified amino acid sequences are also fully described herein, as if all 
were written out in full, taking into account the codon usage, especially that preferred 
in the host bacterium. The alternate codon descriptions are available in common
textbooks, for example, Stryer, BIOCHEMISTRY 3rd ed., and Lehninger,
BIOCHEMISTRY 3rd ed., along with many others. Codon preference tables for
various types of organisms are available in the literature. Sequences with alternate
codons at one or more sites can also be utilized in the computer-related aspects and
embodiments herein. Because of the number of sequence variations involving
alternate codon usage, for the sake of brevity, individual sequences are not separately
listed herein. Instead the alternate sequences are described by reference to the natural
sequence with replacement of one or more (up to all, e.g., up to 3, 5, 10, 15, 20, 30, 40,
50, or more) of the degenerate codons with alternate codons from the alternate codon
table (Table 6), or a modified table applicable to a particular organism that has
differing codon usage, preferably with selection according to preferred codon usage 
for the normal host organism or a host organism in which a sequence is intended to be 
expressed. Those skilled in the art also understand how to alter the alternate codons to 
be used for expression in organisms where certain codons code differently than shown 
in the "universal" codon table. 
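
The arithmetic behind the degeneracy argument can be reproduced directly. The Python sketch below is illustrative only: it assumes the standard ("universal") genetic code rather than Table 6 of this document or any organism-specific table, and the names CODON_TABLE, SYNONYMOUS_COUNTS, and count_encoding_sequences are hypothetical. It lists the codons for alanine, counts the distinct coding sequences for a short peptide, and evaluates the 3^100 estimate.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, written in the conventional T, C, A, G order.
# "*" marks a stop codon. Organism-specific tables may differ.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid.
SYNONYMOUS_COUNTS = Counter(CODON_TABLE.values())


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct coding sequences (stop codon excluded) that
    encode the given peptide under the standard genetic code."""
    for aa in peptide:
        if aa == "*" or aa not in SYNONYMOUS_COUNTS:
            raise ValueError(f"not a standard amino acid code: {aa!r}")
    return prod(SYNONYMOUS_COUNTS[aa] for aa in peptide)


alanine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "A")
print(alanine_codons)                    # the four alanine codons named in the text
print(count_encoding_sequences("MAL"))   # 1 (Met) x 4 (Ala) x 6 (Leu)
print(f"{3**100:.2e}")                   # the averaged estimate for 100 residues
```

The exact count for a real ORF product depends on its composition, since the number of synonymous codons ranges from one (methionine, tryptophan) to six (leucine, serine, arginine); the figure of three codons per residue is an average.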

For amino acid sequences or polypeptides, sequences contain at least 5 peptide- 
linked amino acid residues, and preferably at least 6, 7, 10, 15, 20, 30, or 40, amino 
acids having identical amino acid sequence as the same number of contiguous amino 
acid residues in a particular phage ORF product. In some cases longer sequences may 
be preferred, for example, those of at least 50, 60, 70, 80, or 100 amino acids in 
length. In preferred embodiments, the amino acid sequence contains a sequence which 
is within a length range with a lower length as specified above, and an upper length 
limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding 
full-length ORF product. The upper length limit can also be expressed in terms of the 
number of amino acid residues of the ORF product. In preferred embodiments, the 
amino acid sequence or polypeptide has bacteria-inhibiting function when expressed 
or otherwise present in a bacterial cell which is a host for the bacteriophage from 
which the sequence was derived. 
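
The length range described above, with a fixed lower bound and an upper bound expressed as a percentage of the full-length ORF product, can be computed as follows. This Python sketch is illustrative only; the function name allowed_length_range and the example lengths are hypothetical.

```python
from typing import Optional, Tuple


def allowed_length_range(
    full_length: int, min_length: int, max_percent: int
) -> Optional[Tuple[int, int]]:
    """Return (lower, upper) residue counts for a sequence that is at least
    min_length long and no more than max_percent of full_length, or None
    if no length satisfies both limits."""
    if not 0 < max_percent <= 100:
        raise ValueError("max_percent must be in the range 1-100")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    upper = full_length * max_percent // 100
    if upper < min_length:
        return None
    return (min_length, upper)


# Hypothetical 200-residue ORF product, minimum of 5 residues, 50% upper limit.
print(allowed_length_range(200, 5, 50))
# A very short ORF product may leave no permissible length.
print(allowed_length_range(8, 5, 50))
```

The same calculation applies to nucleic acid sequences by supplying the ORF length in base pairs and the corresponding minimum (e.g., 15 nucleotides).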

By "isolated" in reference to a nucleic acid is meant that a naturally occurring 
sequence has been removed from its normal cellular (e.g., chromosomal) environment 
or is synthesized in a non-natural environment (e.g., artificially synthesized). Thus, 
the sequence may be in a cell-free solution or placed in a different cellular 
environment. The term does not imply that the sequence is the only nucleotide chain 
present, but that it is essentially free (about 90-95% pure at least) of non-nucleotide 
material naturally associated with it, and thus is distinguished from isolated 
chromosomes. 

The term "enriched" means that the specific DNA or RNA sequence 
constitutes a significantly higher fraction (2-5 fold) of the total DNA or RNA present 
in the cells or solution of interest than in normal or diseased cells or in cells from 
which the sequence was originally taken. This could be caused by a person by 
preferential reduction in the amount of other DNA or RNA present, or by a 
preferential increase in the amount of the specific DNA or RNA sequence, or by a 
combination of the two. However, it should be noted that enriched does not imply
that there are no other DNA or RNA sequences present, just that the relative amount
of the sequence of interest has been significantly increased.

The term "significant" is used to indicate that the level of increase is useful to
the person making such an increase, and is an increase relative to other nucleic acids of
at least about 2-fold, more preferably at least 5- to 10-fold or even more. The term
also does not imply that there is no DNA or RNA from other sources. The other
source DNA may, for example, comprise DNA from a yeast or bacterial genome, or a 
cloning vector such as pUC19. This term distinguishes from naturally occurring 
events, such as viral infection, or tumor type growths, in which the level of one 
mRNA may be naturally increased relative to other species of mRNA. That is, the 
term is meant to cover only those situations in which a person has intervened to 
elevate the proportion of the desired nucleic acid. 
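
Fold enrichment in this sense is a ratio of two fractions: the share of the sequence of interest after intervention divided by its share before. The Python sketch below is illustrative only; the function names and the example quantities are hypothetical, and the 2-fold default threshold is taken from the definition of "significant" above.

```python
from fractions import Fraction


def fold_enrichment(
    target_before: int, total_before: int, target_after: int, total_after: int
) -> Fraction:
    """Ratio of the target's fraction of total nucleic acid after
    intervention to its fraction before. Amounts may be in any consistent
    integer unit (e.g., nanograms)."""
    fraction_before = Fraction(target_before, total_before)
    fraction_after = Fraction(target_after, total_after)
    return fraction_after / fraction_before


def is_significant(fold: Fraction, threshold: int = 2) -> bool:
    """True if the increase meets the stated minimum fold threshold."""
    return fold >= threshold


# Enrichment by preferential increase of the target: 1/1000 -> 5/1000.
by_increase = fold_enrichment(1, 1000, 5, 1000)
# Enrichment by preferential reduction of other nucleic acid: 10/1000 -> 10/200.
by_reduction = fold_enrichment(10, 1000, 10, 200)
print(by_increase, by_reduction, is_significant(by_increase))
```

Both routes described in the text, increasing the specific sequence or reducing the other nucleic acid present, give the same fold value when the resulting fractions are equal.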

It is also advantageous for some purposes that a nucleotide sequence be in 
purified form. The term "purified" in reference to nucleic acid does not require 
absolute purity (such as a homogeneous preparation). Instead, it represents an 
indication that the sequence is relatively more pure than in the natural environment 
(compared to the natural level, this level should be at least 2-5 fold greater, e.g., in 
terms of mg/mL). Individual clones isolated from a cDNA library may be purified to 
electrophoretic homogeneity. The claimed DNA molecules obtained from these 
clones could be obtained directly from total DNA or from total RNA. The cDNA 
clones are not naturally occurring, but rather are preferably obtained via manipulation 
of a partially purified naturally occurring substance (messenger RNA). The 
construction of a cDNA library from mRNA involves the creation of a synthetic 
substance (cDNA) and pure individual cDNA clones can be isolated from the 
synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the 
process which includes the construction of a cDNA library from mRNA and isolation 
of distinct cDNA clones yields an approximately 10^6-fold purification of the native
message. Thus, purification of at least one order of magnitude, preferably two or 
three orders, and more preferably four or five orders of magnitude is expressly 
contemplated. 

The terms "isolated", "enriched", and "purified" with respect to nucleic acids,
above, may similarly be used to denote the relative purity and abundance of
polypeptides (multimers of amino acids joined one to another by α-carboxyl:α-amino
group (peptide) bonds). These, too, may be stored in, grown in, screened in, and
selected from libraries using biochemical techniques familiar in the art. Such
polypeptides may be natural, synthetic or chimeric and may be extracted using any of
a variety of methods, such as antibody immunoprecipitation, other "tagging"
techniques, conventional chromatography and/or electrophoretic methods. Some of 
the above utilize the corresponding nucleic acid sequence.

As indicated above, aspects and embodiments of the invention are not limited
to entire genes and proteins. The invention also provides and utilizes fragments and
portions thereof, preferably those which are "active" in the inhibitory sense described
above. Such peptides or oligopeptides and oligo or polynucleotides have preferred
lengths as specified above for nucleic acid and amino acid sequences from phage;
corresponding recombinant constructs can be made to express the same.
Also included are homologous sequences and fragments thereof.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by
well-known methods. Also, by having particular phage ORFs, e.g., the phage ORFs
identified herein (e.g., anti-bacterial ORFs of the present invention, portions thereof,
or oligonucleotides derived therefrom as described), other antimicrobial sequences
from other bacteriophage sources can be identified and isolated using methods
described here or other methods, including methods utilizing nucleic acid
hybridization and/or computer-based sequence alignment methods.

The invention also provides bacteriophage antimicrobial DNA segments from
other phages based on nucleic acids and sequences hybridizing to the presently
identified inhibitory ORF under high stringency conditions or sequences that are
highly homologous. The bacteriophage segment from a specific phage, e.g., an
antimicrobial DNA segment, can be used to identify a related segment from another
unrelated phage based on stringent conditions of hybridization or on being a homolog
based on nucleic acid and/or amino acid sequence comparisons. As with identified
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets.

The nucleotide and amino acid sequences identified herein are believed to be
correct; however, certain sequences may contain a small percentage of errors, e.g.,
1-5%. In the event that any of the sequences have errors, the corrected sequences can be
readily provided by one skilled in the art using routine methods. For example, the
nucleotide sequences can be confirmed or corrected by obtaining and culturing the
relevant phage, and purifying phage genomic nucleic acids. A region or regions of
interest can be amplified, e.g., by PCR from the appropriate genomic template, using
primers based on the described sequence. The amplified regions can then be
sequenced using any of the available methods (e.g., a dideoxy termination method).
This can be done redundantly to provide the corrected sequence or to confirm that the
described sequence is correct. Alternatively, a particular sequence or sequences can
be identified and isolated as an insert or inserts in a phage genomic library and
isolated, amplified, and sequenced by standard methods. Confirmation or correction
of a nucleotide sequence for a phage gene provides an amino acid sequence of the
encoded product by merely reading off the amino acid sequence according to the
normal codon relationships; alternatively or in addition, the sequence can be
expressed in a standard expression system and the
polypeptide product sequenced by standard techniques. The sequences described
herein thus provide unique identification of the corresponding genes, coding
sequences, and other sequences, allowing those sequences to be used in the various
aspects of the present invention.
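
Reading off an amino acid sequence from a confirmed nucleotide sequence can be sketched as below. This Python example is illustrative only: it assumes the standard genetic code and a coding sequence that begins in frame, and the names CODON_TABLE and translate, as well as the example sequence, are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G order; "*" is a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA letters) into a
    one-letter amino acid string, stopping at the first stop codon."""
    sequence = coding_sequence.upper().replace("U", "T")
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    residues = []
    for i in range(0, len(sequence), 3):
        amino_acid = CODON_TABLE[sequence[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical sequence: Met, then the four alanine codons, then a stop codon.
print(translate("ATGGCTGCCGCAGCGTAA"))
```

Translating both the described sequence and a resequenced amplification product and comparing the results is one simple way to see whether a nucleotide correction changes the encoded product.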

In other aspects, the invention provides recombinant vectors and cells
harboring at least one of the phage ORFs or portion thereof, or bacterial target
sequences described herein. As understood by those skilled in the art, vectors may be
provided in different forms, including, for example, plasmids, cosmids, and
virus-based vectors. See, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; see also Ausubel,
F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John Wiley & Sons,
Secaucus, N.J.

In preferred embodiments, the vectors will be expression vectors, preferably
shuttle vectors that permit cloning, replication, and expression within bacteria. An
"expression vector" is one having regulatory nucleotide sequences containing
transcriptional and translational regulatory information that controls expression of the
nucleotide sequence in a host cell. Preferably the vector is constructed to allow
amplification from vector sequences flanking an insert locus. In certain embodiments,
the expression vectors may additionally or alternatively support expression and/or
replication in animal, plant and/or yeast cells due to the presence of suitable
regulatory sequences, e.g., promoters, enhancers, 3' stabilizing sequences, primer
sequences, etc. In preferred embodiments, the promoters are inducible and specific
for the system in which expression is desired, e.g., bacteria, animal, plant, or yeast.
The vectors may optionally encode a "tag" sequence or sequences to facilitate protein
purification. Convenient restriction enzyme cloning sites and suitable selective
marker(s) are also optionally included. Such selective markers can be, for example,
antibiotic resistance markers or markers which supply an essential nutritive growth
factor to an otherwise deficient mutant host, e.g., tryptophan, histidine, or leucine in
the Yeast Two-Hybrid systems described below.

The term "recombinant vector" relates to a single- or double-stranded circular
nucleic acid molecule that can be transfected into cells and replicated within or 
independently of a cell genome. A circular double-stranded nucleic acid molecule can 
be cut and thereby linearized upon treatment with appropriate restriction enzymes. An 
assortment of nucleic acid vectors, restriction enzymes, and the knowledge of the
nucleotide sequences cut by restriction enzymes are readily available to those skilled 
in the art. A nucleic acid molecule encoding a desired product can be inserted into a 
vector by cutting the vector with restriction enzymes and ligating the two pieces 
together. Preferably the vector is an expression vector, e.g., a shuttle expression 
vector as described above.

By "recombinant cell" is meant a cell possessing introduced or engineered
nucleic acid sequences, e.g., as described above. The sequence may be in the form of 
or part of a vector or may be integrated into the host cell genome. Preferably the cell 
is a bacterial cell. 

In another aspect, the invention also provides methods for identifying and/or
screening compounds "active on" at least one bacterial target of a bacteriophage
inhibitor protein or RNA. Preferred embodiments involve contacting such a bacterial
target or targets (e.g., bacterial target proteins) with a test compound, and determining
whether the compound binds to or reduces the level of activity of the bacterial target
(e.g., a bacterial target protein). Preferably this is done either in vivo (i.e., in a
cell-based assay) or in vitro, e.g., in a cell-free system under approximately physiological
conditions.

The compounds that can be used may be large or small, synthetic or natural, 
organic or inorganic, proteinaceous or non-proteinaceous. In preferred embodiments, 
the compound is a peptidomimetic, as described herein, a bacteriophage inhibitor
protein or fragment or derivative thereof, preferably an "active portion", or a small 
molecule. 

In preferred embodiments, the bacterial target is a target of a phage ORF
identified herein, e.g., S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

In particular embodiments, the methods include the identification of bacterial 
targets or the site of action of an inhibitor on a bacterial target as described above or 
otherwise described herein. 

In embodiments involving binding assays, preferably binding is to a fragment
or portion of a bacterial target protein, where the fragment includes less than 90%,
80%, 70%, 60%, 50%, 40%, or 30% of an intact bacterial target protein. Preferably,
the at least one bacterial target includes a plurality of different targets of
bacteriophage inhibitor proteins, preferably a plurality of different targets. The 
plurality of targets can be in or from a plurality of different bacteria, but preferably is 
from a single bacterial species. 

A "method of screening" refers to a method for evaluating a relevant activity
or property of a large plurality of compounds (e.g., a bacteria-inhibiting activity), 
rather than just one or a few compounds. For example, a method of screening can be 
used to conveniently test at least 100, more preferably at least 1000, still more 
preferably at least 10,000, and most preferably at least 100,000 different compounds, 
or even more. 

In the context of this invention, the term "small molecule" refers to 
compounds having molecular mass of less than 2000 Daltons, preferably less than 
1500, still more preferably less than 1000, and most preferably less than 600 Daltons. 
Preferably but not necessarily, a small molecule is not an oligopeptide. 

In a related aspect or in preferred embodiments, the invention provides a 
method of screening for potential antibacterial agents by determining whether any of a 
plurality of compounds, preferably a plurality of small molecules, is active on at least 
one target of a bacteriophage inhibitor protein or RNA. Preferred embodiments 
include those described for the above aspect, including embodiments which involve 
determining whether one or more test compounds bind to or reduce the level of 
activity of a bacterial target, and embodiments which utilize a plurality of different 
targets as described above. 

The identification of bacteria-inhibiting phage ORFs and their encoded 
products also provides a method for identifying an active portion of such an encoded 
product. This also provides a method for identifying a potential antibacterial agent by 
identifying such an active portion of a phage ORF or ORF product. In preferred 
embodiments, the identification of an active portion involves one or more of 
mutational analysis, deletion analysis, or analysis of fragments of such products. The 
method can also include determination of a 3-dimensional structure of an active 
portion, such as by analysis of crystal diffraction patterns. In further embodiments, 
the method involves constructing or synthesizing a peptidomimetic compound, where 
the structure of the peptidomimetic compound corresponds to the structure of the 
active portion. In this context, "corresponds" means that the peptidomimetic 
compound structure has sufficient similarities to the structure of the active portion that 
the peptidomimetic will interact with the same molecule as the phage protein and 
preferably will elicit at least one cellular response in common which relates to the 
inhibition of the cell by the phage protein.

In preferred embodiments, the ORF or ORF product is or is derived or
obtained from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae 
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or 
Enterococcus sp. phage 182 ORF 002, 008, or 014 or product thereof. 

The methods for identifying or screening for compounds or agents active on a 
bacterial target of a phage-encoded inhibitor can also involve identification of a 
phage-specific site of action on the target. 

Preferably in the methods for identifying or screening for compounds active 
on such a bacterial target, the target is uncharacterized; the target is from an 
uncharacterized bacterium from Table 1; the site of action is a phage-specific site of
action. 

Further embodiments include the identification of inhibitor phage ORFs and 
bacterial targets as in aspects above. 

An "active portion" as used herein denotes an epitope, a catalytic or regulatory 
domain, or a fragment of a bacteriophage inhibitor protein that is responsible for, or a 
significant factor in, bacterial target inhibition. The active portion preferably may be 
removed from its contiguous sequences and, in isolation, still effect inhibition. 

By "mimetic" is meant a compound structurally and functionally related to a 
reference compound that can be natural, synthetic, or chimeric. In terms of the present 
invention, a "peptidomimetic," for example, is a compound that mimics the
activity-related aspects of the 3-dimensional structure of a peptide or polypeptide in a
non-peptide compound, for example mimics the structure of a peptide or active portion of
a phage- or bacterial ORF-encoded polypeptide. 

A related aspect provides a method for inhibiting a bacterial cell by contacting 
the bacterial cell with a compound active on a bacterial target of a bacteriophage 
inhibitor protein or RNA, where the target was uncharacterized. In preferred 
embodiments, the compound is such a protein, or a fragment or derivative thereof; a 
structural mimetic, e.g., a peptidomimetic, of such a protein or fragment; a small
molecule; the contacting is performed in vitro; the contacting is performed in vivo in
an infected or at risk organism, e.g., an animal such as a mammal or bird, for example,
a human, or other mammal described herein; the bacterium is selected from a genus
and/or species listed in Table 1; the bacteriophage inhibitor protein is uncharacterized;
the bacteriophage inhibitor protein is from an uncharacterized phage listed in Table 1;
the phage inhibitor protein is from one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

In the context of targets in this invention, the term "uncharacterized" means
that the target was not recognized as an appropriate target for an antibacterial agent 
prior to the filing of the present application or alternatively prior to the present 
invention. Such lack of recognition can include, for example, situations where the 
5 target and/or a nucleotide sequence encoding the target were unknown, situations 
where the target was known, but where it had not been identified as an appropriate 
target or as an essential cellular component, and situations where the target was 
known as essential but had not been recognized as an appropriate target due to a belief 
that the target would be inaccessible or otherwise that contacting the cell with a 
10 compound active on the target in vitro would be ineffective in cellular inhibition, or 
ineffective in treatment of an infection. Methods described herein utilizing bacterial 
targets, e.g., for inhibiting bacteria or treating bacterial infections, can also utilize 
"uncharacterized target sites", meaning that the target has been previously recognized 
as an appropriate target for an antibacterial agent, but where an agent or inhibitor of 
1 5 the invention is used which acts at a different site than that at which the previously 
utilized antibacterial agent, j.e M a phage-specific site. Preferably the phage-specific 
site has different functional characteristics from the previously utilized site. In the 
context of targets or target sites, the term "phage-specific" indicates that the target or 
site is utilized by at least one bacteriophage as an inhibitory target and is different 
20 from previously identified targets or target sites. 

In the context of this invention, the term "bacteriophage inhibitor protein" 
refers to a protein encoded by a bacteriophage nucleic acid sequence which inhibits 
bacterial function in a host bacterium. Thus, it is a bacteria-inhibiting phage product. 
In the context of this invention, the phrase "contacting the bacterial cell with a
compound active on a bacterial target of a bacteriophage inhibitor protein" or
equivalent phrases refers to contacting with an isolated, purified, or enriched
compound or a composition including such a compound, but specifically does not rely
on contacting the bacterial cell with an intact phage which encodes the compound.
Preferably no intact phage are involved in the contacting.

Related aspects provide methods for prophylactic or therapeutic treatment of a
bacterial infection by administering to an infected, challenged or at risk organism a
therapeutically or prophylactically effective amount of a compound active on a target
of a bacteriophage inhibitor protein or RNA, or as described for the previous aspect.
Preferably the bacterium involved in the infection or risk of infection produces the
identified target of the bacteriophage inhibitor protein or alternatively produces a
homologous target compound. In preferred embodiments, the host organism is a plant
or animal, preferably a mammal or bird, and more preferably, a human or other
mammal described herein. Preferred embodiments include, without limitation, those
as described for the preceding aspect.

Compounds useful for the methods of inhibiting, methods of treating, and
pharmaceutical compositions can include novel compounds, but can also include
compounds which had previously been identified for a purpose other than inhibition
of bacteria. Such compounds can be utilized as described and can be included in
pharmaceutical compositions.

In preferred embodiments of this and other aspects of the invention utilizing
bacterial target sequences of a bacteriophage inhibitory ORF product, the target
sequence is encoded by a Staphylococcus nucleic acid coding sequence, preferably S.
aureus, a Streptococcus nucleic acid coding sequence, preferably Streptococcus
pneumoniae, or an Enterococcus nucleic acid coding sequence. Possible target
sequences are described herein by reference to sequence source sites.

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein; instead, they are described by reference to the GenBank
entries. In cases where the TIGR or GenBank entry for a coding region is not
complete, the complete sequence can be readily obtained by routine methods, e.g.,
by isolating a clone in a phage host genomic library, and sequencing the clone
insert to provide the relevant coding region. The boundaries of the coding region
can be identified by conventional sequence analysis and/or by expression in a
bacterium in which the endogenous copy of the coding region has been inactivated
and using subcloning to identify the functional start and stop codons for the
coding region.
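
By way of illustration only, the translation of a coding region into an amino acid
sequence can be sketched as follows. The sketch assumes the standard codon
assignments (which bacterial translation tables share for elongation), a coding
region that is already in frame, and one-letter amino acid codes; the function name
translate is hypothetical and is not part of the disclosure.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_region: str) -> str:
    """Translate an in-frame coding region, stopping at the first stop codon."""
    seq = coding_region.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an ambiguous codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

For example, translate("ATGGCCAAATAA") yields the three-residue peptide "MAK".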

In the context of nucleic acid or amino acid sequences of this invention, the
term "corresponding" indicates that the sequence is at least 95% identical, preferably
at least 97% identical, and more preferably at least 99% identical to a sequence from
the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent
(utilizing one or more degenerate codons), or a homologous sequence, where the
homolog provides functionally equivalent biological function.
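
As a non-limiting sketch, the identity thresholds above can be checked on a pair of
sequences that have already been aligned. The text does not specify how percent
identity is computed; the sketch assumes identity is the number of matching columns
divided by the number of aligned columns, with "-" denoting a gap, and the function
names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def corresponds(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the pair meets the chosen identity threshold (95, 97, or 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A pair differing at 3 of 100 aligned positions is 97% identical, so it satisfies the
95% and 97% thresholds but not the 99% threshold.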

By "treatment" or "treating" is meant administering a compound or
pharmaceutical composition for prophylactic and/or therapeutic purposes. The term
"prophylactic treatment" refers to treating a patient or animal that is not yet infected
but is susceptible to or otherwise at risk of a bacterial infection. The term "therapeutic
treatment" refers to administering treatment to a patient already suffering from an
infection.



WO 00/32825 



PCT/IB99/02040 




The term "bacterial infection" refers to the invasion of the host organism, 
animal or plant, by pathogenic bacteria. This includes the excessive growth of bacteria 
which are normally present in or on the body of the organism, but more generally, a 
bacterial infection can be any situation in which the presence of a bacterial 
population(s) is damaging to a host organism. Thus, for example, an organism suffers
from a bacterial infection when excessive numbers of a bacterial population are
present in or on the organism's body, or when the effects of the presence of a bacterial
population(s) are damaging to the cells, tissue, or organs of the organism.

The terms "administer", "administering", and "administration" refer to a 
method of giving a dosage of a compound or composition, e.g., an antibacterial 
pharmaceutical composition, to an organism. Where the organism is a mammal, the 
method is, e.g., topical, oral, intravenous, transdermal, intraperitoneal, intramuscular, 
or intrathecal. The preferred method of administration can vary depending on various 
factors, e.g., the components of the pharmaceutical composition, the site of the 
potential or actual bacterial infection, the bacterium involved, and the infection 
severity. 

The term "mammal" has its usual biological meaning referring to any 
organism of the Class Mammalia of higher vertebrates that nourish their young with 
milk secreted by mammary glands, e.g., mouse, rat, and, in particular, human, bovine, 
sheep, swine, dog, and cat. 

In the context of treating a bacterial infection a "therapeutically effective 
amount" or "pharmaceutically effective amount" indicates an amount of an 
antibacterial agent, e.g., as disclosed for this invention, which has a therapeutic effect. 
This generally refers to the inhibition, to some extent, of the normal cellular 
functioning of bacterial cells that renders or contributes to bacterial infection. 

The dose of antibacterial agent that is useful as a treatment is a 
"therapeutically effective amount." Thus, as used herein, a therapeutically effective 
amount means an amount of an antibacterial agent that produces the desired 
therapeutic effect as judged by clinical trial results and/or animal models. This amount 
can be routinely determined by one skilled in the art and will vary depending on 
several factors, such as the particular bacterial strain involved and the particular 
antibacterial agent used. 

In connection with claims to methods of inhibiting bacteria and therapeutic or 
prophylactic treatments, "a compound active on a target of a bacteriophage inhibitor 
protein" or terms of equivalent meaning differ from administration of or contact with
an intact phage naturally encoding the full-length inhibitor compound. While an
intact phage may conceivably be incorporated in the present methods, the method at
least includes the use of an active compound as specified different from a full-length
inhibitor protein naturally encoded by a bacteriophage and/or a delivery or contacting
method different from administration of or contact with an intact phage encoding the
full-length protein. Similarly, pharmaceutical compositions described herein at least
include an active compound different from a full-length inhibitor protein naturally
encoded by a bacteriophage or such a full-length protein is provided in the
composition in a form different from being encoded by an intact phage. Preferably
the methods and compositions do not include an intact phage. 

In accord with the above aspects, the invention also provides antibacterial 
agents and compounds active on bacterial targets of bacteriophage inhibitor proteins 
or RNAs, where the target was uncharacterized as indicated above. As previously 
indicated, such active compounds include both novel compounds and compounds 
which had previously been identified for a purpose other than inhibition of bacteria. 
Such previously identified biologically active compounds can be used in 
embodiments of the above methods of inhibiting and treating. In preferred 
embodiments, the targets, bacteriophage, and active compound are as described herein 
for methods of inhibiting and methods of treating. Preferably the agent or compound 
is formulated in a pharmaceutical composition which includes a pharmaceutically 
acceptable carrier, excipient, or diluent. In addition, the invention provides agents, 
compounds, and pharmaceutical compositions where an active compound is active on 
an uncharacterized phage-specific site. 

In preferred embodiments, the target is as described for embodiments of 
aspects above. 

Likewise, the invention provides a method of making an antibacterial agent. 
The method involves identifying a target of a bacteriophage inhibitor polypeptide or 
protein or RNA, screening a plurality of compounds to identify a compound active on 
the target, and synthesizing the compound in an amount sufficient to provide a 
therapeutic effect when administered to an organism infected by a bacterium naturally 
producing the target. In preferred embodiments, the identification of the target and 
identification of active compounds include steps or methods and/or components as 
described above (or otherwise herein) for such identification. Likewise, the active 
compound can be as described above, including fragments and derivatives of phage 
inhibitor proteins, peptidomimetics, and small molecules. As recognized by those 
skilled in the art, peptides can be synthesized by expression systems and purified, or 
can be synthesized artificially. In preferred embodiments the inhibitory phage ORF
product is from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated above, sequence analysis of nucleotide and/or amino acid
sequences can beneficially utilize computer analysis. Thus, in additional aspects the
invention provides computer-related hardware and media and methods utilizing and
incorporating sequence data from uncharacterized phage, e.g., uncharacterized phage
listed in Table 1, preferably at least one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014, or
Staphylococcus aureus phage 44AHJD, Enterococcus sp. phage 182, or
Streptococcus pneumoniae phage Dp-1. In general, such aspects can facilitate the
above-described aspects. Various embodiments involve the analysis of genetic
sequence and encoded products, as applied to evaluating bacteriophage inhibitor
ORFs and compounds and fragments related thereto. The various sequence analyses,
as well as function analyses, can be used separately or in combination, as well as in
preceding aspects and embodiments. Use in combination is often advantageous as the
additional information allows more efficient prioritizing of phage ORFs for
identification of those ORFs that provide bacteria-inhibiting function.

In one aspect, the invention provides a computer-readable device which
includes at least one recorded amino acid or nucleotide sequence corresponding to one
of the specified phage and a sequence analysis program for analyzing a nucleotide
and/or amino acid sequence. The device is arranged such that the sequence
information can be retrieved and analyzed using the analysis program. The analysis
can identify, for example, homologous sequences or the indicated percentages of the
phage genome and structural motifs. Preferably the sequence includes at least 1 phage
ORF or encoded product, more preferably at least 10%, 20%, 30%, 40%, 50%, 70%,
90%, or 100% of the genomic phage ORFs and/or equivalent cDNA, RNA, or amino acid
sequences. Preferably the sequence or sequences in the device are recorded in a
medium such as a floppy disk, a computer hard drive, an optical disk, computer
random access memory (RAM), or magnetic tape. The program may also be recorded
in such medium. The sequences can also include sequences from a plurality of
different phage.

In this context, the term "corresponding" indicates that the sequence is at least
95% identical, preferably at least 97% identical, and more preferably at least 99%
identical to a sequence from the specified phage genome, a ribonucleotide equivalent,
a degenerate equivalent (utilizing one or more degenerate codons), or a homologous
sequence, where the homolog provides functionally equivalent biological function.

Similarly, the invention provides a computer analysis system for identifying
biologically important portions of a bacteriophage genome. The system includes a
data storage medium, e.g., as identified above, which has recorded thereon a
nucleotide sequence corresponding to at least a portion of at least one uncharacterized
bacteriophage genome, a set of program instructions to allow searching of the
sequence or sequences to analyze the sequence, and an output device, where the
portion includes at least the sequence length as specified in the preceding aspect. The
output device is preferably a printer, a video display, or a recording medium. More
than one output device may be included. For each of the present computer-related
aspects, the bacteriophage are preferably selected from the uncharacterized phage
listed in Table 1, more preferably from bacteriophage 77, 3A, 96, 44AHJD (S.
aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus).

In keeping with the computer device aspects, the invention also provides a
method for identifying or characterizing a bacteriophage ORF by providing a
computer-based system for analyzing nucleotide or amino acid sequences, e.g., as
described above. The system includes a data storage medium which has recorded a
sequence or sequences as described for the above devices, a set of instructions as in
the preceding aspect, and an output device as in the preceding aspect. The method
further involves analyzing at least one sequence, and outputting the analysis results to
at least one output device.
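
Purely as an illustrative sketch of the kind of analysis such a system might run, the
following scans a recorded nucleotide sequence for candidate ORFs in all six reading
frames. The text does not prescribe an ORF-finding algorithm; the sketch assumes
ATG as the only start codon (phage also use alternative starts such as GTG and TTG),
a default cutoff of more than 33 amino acids as used for the ORF tables herein, and
coordinates that are 0-based on whichever strand was scanned. The name find_orfs is
hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_aa: int = 34):
    """List (strand, start, end, length_aa) for ATG-to-stop ORFs in six frames.

    length_aa counts the initiating methionine but not the stop codon.
    """
    genome = genome.upper()
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    length_aa = (i - start) // 3
                    if length_aa >= min_aa:
                        orfs.append((strand, start, i + 3, length_aa))
                    start = None
    return orfs


if __name__ == "__main__":
    # Output step: print each ORF found in a toy sequence to the display.
    toy = "ATG" + "GCC" * 40 + "TAA"
    for orf in find_orfs(toy):
        print(orf)
```

The resulting list can then be written to any output device, e.g., printed or saved,
and the ORFs passed on to homology or motif analyses.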

In preferred embodiments, the analysis identifies a sequence similarity or
homology with a sequence or sequences selected from bacterial ORFs encoding
products with related biological function; ORFs encoding known inhibitors; and
essential bacterial ORFs. Preferably the analysis identifies a probable biological
function based on identification of structural elements or characteristic or signature
motifs of an encoded product or on sequence similarity or homology. Preferably the
uncharacterized bacteriophage is from Table 1, more preferably at least one of
bacteriophage 77, 3A, 96, 44AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or
182 (Enterococcus). In preferred embodiments, the method also involves determining
at least a portion of the nucleotide sequence of at least one uncharacterized
bacteriophage as indicated, and recording that sequence on the data storage medium of
the computer-based system. In preferred embodiments, the analysis identifies a
sequence similarity or homology with a S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As used in the claims to describe the various inventive aspects and
embodiments, "comprising" means including, but not limited to, whatever follows the
word "comprising". Thus, use of the term "comprising" indicates that the listed
elements are required or mandatory, but that other elements are optional and may or
may not be present. By "consisting of" is meant including, and limited to, whatever
follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the
listed elements are required or mandatory, and that no other elements may be present.
By "consisting essentially of" is meant including any elements listed after the phrase,
and limited to other elements that do not interfere with or contribute to the activity or
action specified in the disclosure for the listed elements. Thus, the phrase "consisting
essentially of" indicates that the listed elements are required or mandatory, but that
other elements are optional and may or may not be present depending upon whether or
not they affect the activity or action of the listed elements.

Further embodiments will be apparent from the following Detailed Description
and from the claims.



BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURES 1A and 1B are flow schematics showing the manipulations used to
convert pT0021, an arsenite inducible vector containing the luciferase gene, into
pTHA or pTM, two ars inducible vectors. Vector pTHA contains Bam HI, Sal I, and
Hind III cloning sites and a downstream HA epitope tag. Vector pTM contains Bam
HI and Hind III cloning sites and no HA epitope tag.

FIGURE 2 is a schematic representation of the cloning steps involved to place
the DNA segments of any of ORFs 17/19/43/102/104/182 or other sequences into
pTHA to assess inhibitory potential. For subcloning into pTM or pT0021, individual
ORFs were amplified by PCR using oligonucleotides targeting the ATG and stop
codons of the ORFs. Using this strategy, Bam HI and Hind III sites were positioned
immediately upstream and downstream, respectively, of the start and stop codons of
each ORF. Following digestion with Bam HI and Hind III, the PCR fragments were
subcloned into the same sites of pT0021 or pTM. Clones were verified by PCR and
direct sequencing.
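
The primer strategy described for FIGURE 2 can be sketched, for illustration only, as
follows. The recognition sequences for Bam HI (GGATCC) and Hind III (AAGCTT) are
standard; the 18-nucleotide annealing length, the 3-nucleotide 5' clamp, and the
function name design_primers are assumptions made for this sketch and are not taken
from the disclosure.

```python
BAMHI_SITE = "GGATCC"    # Bam HI recognition sequence
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(orf: str, anneal: int = 18, clamp: str = "CGC"):
    """Return (forward, reverse) primers, both written 5' to 3'.

    The forward primer places a Bam HI site immediately upstream of the ATG;
    the reverse primer places a Hind III site immediately downstream of the
    stop codon. `anneal` and `clamp` are hypothetical design parameters.
    """
    orf = orf.upper()
    if not orf.startswith("ATG") or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must run from an ATG start codon to a stop codon")
    if len(orf) < anneal:
        raise ValueError("ORF is shorter than the annealing length")
    forward = clamp + BAMHI_SITE + orf[:anneal]
    reverse = clamp + HINDIII_SITE + reverse_complement(orf[-anneal:])
    return forward, reverse
```

In practice, melting temperature, secondary structure, and the absence of internal
Bam HI or Hind III sites within the ORF would also need to be checked.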

FIGURE 3 shows a schematic representation of the functional assays used to
characterize the bactericidal and bacteriostatic potential of all predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Fig. 3A) Functional assay on semi-solid
support media. Fig. 3B) Functional assay in liquid culture.

FIGURES 4A, B, and C are bar graphs showing the results of a screen in liquid
media to assess bacteriostatic or bactericidal activity of 93 predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Growth inhibition assays were performed
as detailed in the Detailed Description. The relative growth of Staphylococcus aureus
transformants harboring a given bacteriophage 77 ORF (identified on the bottom of
the graph), in the absence or presence of arsenite, is plotted relative to growth of a
Staphylococcus aureus transformant containing ORF 5, a non-toxic bacteriophage 77
ORF (which is set at 100%). Each bar represents the average obtained from three
S. aureus transformants grown in duplicate. Bacteriophage 77 ORFs showing
significant growth inhibition are ORFs 17, 19, 102, 104, and 182.
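
The normalization described for FIGURE 4 can be illustrated by the following sketch,
in which the growth readings of each transformant are averaged and expressed as a
percentage of the ORF 5 control. The readings used in the example are hypothetical
values chosen for illustration, not data from the screen, and the function name
relative_growth is likewise hypothetical.

```python
from statistics import mean


def relative_growth(readings, control_readings):
    """Mean growth of an ORF transformant as a percentage of the control mean.

    `readings` and `control_readings` are growth measurements (e.g., optical
    density) pooled over three transformants grown in duplicate.
    """
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * mean(readings) / control


if __name__ == "__main__":
    # Hypothetical readings: three transformants, each grown in duplicate.
    orf5_control = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
    candidate_orf = [0.125, 0.125, 0.125, 0.125, 0.125, 0.125]
    print(relative_growth(candidate_orf, orf5_control))
```

By construction the ORF 5 control scores 100%, so an inhibitory ORF appears as a bar
well below 100% when expression is induced with arsenite.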

FIGURE 5 shows a block diagram of major components of a general purpose 
computer. 

FIGURE 6 shows an ORF map for Streptococcus pneumoniae bacteriophage 
Dp-1 showing the ORF identifiers, genomic locations, and orientations of the 85 
identified ORFs that were found to have ribosomal binding sites and thus are expected 
to be expressed. 

FIGURE 7 shows a schematic representation of the arsenite-inducible 
expression system present in a shuttle vector designed to express individual 
Streptococcus bacteriophage Dp-1 ORFs in Streptococcus. Various modifications can 
be readily made to such a vector, or other vectors can be readily constructed to 
provide inducible expression of ORFs in a particular host bacterium using well-known 
techniques.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention may be more clearly understood from the following description.
The tables will first be briefly described.

Table 1 is a listing of a large number of available bacteriophage that can be 
readily obtained and used in the present invention. 

Table 2 shows the complete nucleotide sequence of the genome of 
Staphylococcus aureus bacteriophage 77. 

Table 3 shows a list of all the ORFs from Bacteriophage 77 that were screened
in the functional assay to identify those with anti-microbial activity.

Table 4 shows the predicted nucleotide sequence, predicted amino acid
sequence, and physicochemical parameters of ORFs 17/19/43/102/104/182. These
include the primary amino acid sequence of the predicted protein, the average
molecular weight, amino acid composition, theoretical pI, hydrophobicity map, and
predicted secondary structure map.

Table 5 shows homology search results. BLAST analysis was performed with
ORFs 17/19/43/102/104/182 against NCBI non-redundant nucleotide and
Swissprot databases. The results of this search indicate that: I) ORF 17 has no
significant homology to any gene in the NCBI non-redundant nucleotide
database, II) ORF 19 has significant homology to one gene in the NCBI
non-redundant nucleotide database - the gene encoding ORF 59 of bacteriophage phi PVL,
III) ORF 43 has significant homology to one gene in the NCBI non-redundant
nucleotide database - the gene encoding ORF 39 of phi PVL, IV) ORF 102 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 38 of phi PVL, V) ORF 104 has no significant homology to
any gene in the NCBI non-redundant nucleotide database, and VI) ORF 182 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 39 of phi PVL.

Table 6 is a table from Alberts et al., MOLECULAR BIOLOGY OF THE
CELL 3rd ed., showing the redundancy of the "universal" genetic code.

Table 7 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 3A.

Table 8 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 3A.

Table 9 shows the complete nucleotide sequence of Staphylococcus aureus
bacteriophage 96.

Table 10 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 96.

Table 11 is a listing of sequences deposited in the NCBI public database
(GenBank) for bacteriophage listed in Table 1.

Table 12 is a listing of phage which encode a known lysis function, including
the identified lysis gene.

Table 13 is a listing of bacteriophage which encode holin genes, where holin
genes encode proteins which form pores and eventually enable other enzymes to kill
the host bacterium.

Table 14 is a listing of bacteriophage which encode kil genes.

Table 15 is a list of Staphylococcus aureus sequences identified by accession
number which may include sequences from genes coding for target sequences for the
phage 77-encoded antimicrobial proteins or peptides. The sequences were obtained
by searching GenBank for listings.

Table 16 shows the nucleotide sequence of the genome of Staphylococcus
aureus phage 44AHJD.

Table 17 lists and shows the sequence position of the 73 ORFs predicted to be
encoded by Staphylococcus aureus bacteriophage 44AHJD that are greater than 33
amino acids.

Table 18 shows the ORF sequences and putative amino acid sequences for the
Staphylococcus aureus bacteriophage 44AHJD ORFs greater than 33 amino acids.

Table 19 shows the similarities in sequence identified between predicted
Staphylococcus aureus bacteriophage 44AHJD ORFs and sequences present in public
databases.

Table 20 shows the homology alignments between predicted Staphylococcus
aureus bacteriophage 44AHJD ORFs and the corresponding protein sequences present
in public sequence databases.

Table 21 shows the complete nucleotide sequence of the genome of
Enterococcus bacteriophage 182.

Table 22 lists and shows the sequence position of the 80 ORFs identified in
bacteriophage 182 that are greater than 33 amino acids.

Table 23 shows the nucleotide and predicted amino acid sequence of all 80
ORFs identified in bacteriophage 182.

Table 24 shows the similarities identified to date in sequence between
Enterococcus phage 182 ORFs greater than 33 amino acids and sequences present in
public sequence databases.

Table 25 shows the predicted amino acid sequence as well as the predicted
secondary structures map for two Enterococcus bacteriophage 182 ORFs.

Table 26 shows the homology alignments between predicted Enterococcus
bacteriophage 182 ORFs and the corresponding protein sequences present in public
sequence databases.

Table 27 lists Enterococcus sequences listed in GenBank providing possible
Enterococcal target sequences for inhibitory Enterococcus bacteriophage 182 ORFs
and other compounds with antibacterial activity.

Table 28 shows the complete nucleotide sequence of the genome of
Streptococcus bacteriophage Dp-1.

Table 29 lists and shows the sequence position of the 273 ORFs identified in
Pneumococcal bacteriophage Dp-1 that are greater than 33 amino acids, 85 of which
are predicted to be expressed in Dp-1 as having a ribosomal binding site. That set of
85 ORFs is shown in the attached drawings.

Table 30 shows the nucleotide and predicted amino acid sequence of all 273
ORFs identified in bacteriophage Dp-1 that are identified as being expressed.

Table 31 shows the similarities identified in sequence between Streptococcus
phage Dp-1 ORFs greater than 33 amino acids and sequences present in public
sequence databases.

Table 32 shows the 4731 bp sequence of Dp-1 published by Sheehan et al.
(1997).

Table 33 lists Streptococcus pneumoniae sequences listed in GenBank
providing possible target sequences for inhibitory Streptococcus pneumoniae
bacteriophage Dp-1 ORFs and other compounds with antibacterial activity.

Background: 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage coding sequences and the encoded polypeptides or RNA transcripts to
identify bacterial targets for potential new antibacterial agents. Thus, the invention
concerns the selection of relevant bacteria. Particularly relevant bacteria are those
which are pathogens of a complex organism such as an animal (e.g., mammals,
reptiles, and birds) or a plant. Examples include Staphylococcus aureus, Enterococcus
species, and Streptococcus pneumoniae. However, the invention can be applied to
any bacterium (whether pathogenic or not) for which bacteriophage are available or
which are found to have cellular components closely homologous to components
targeted by phage of another bacterium.

Thus, the invention also concerns the bacteriophage which can infect a
selected bacterium. Identification of ORFs or products from the phage which inhibit
the host bacterium both provides an inhibitor compound and allows identification of
the bacterial target affected by the phage-encoded inhibitor. Such targets are thus
identified as potential targets for development of other antibacterial agents or
inhibitors and the use of those targets to inhibit those bacteria. As indicated above,
even if such a target is not initially identified in a particular bacterium, such a target
can still be identified if a homologous target is identified in another bacterium.
Usually, but not necessarily, such another bacterium would be a genetically closely
related bacterium. Indeed, in some cases, a phage-encoded inhibitor can also inhibit
such a homologous bacterial cellular component.

The demonstration that bacteriophage have adapted to inhibiting a host
bacterium by acting on a particular cellular component or target provides a strong
indication that that component is an appropriate target for developing and using
antibacterial agents, e.g., in therapeutic treatments. Thus, the present invention
provides additional guidance over mere identification of bacterial essential genes, as
the present invention also provides an indication of accessibility of the target to an
inhibitor, and an indication that the target is sufficiently stable over time (e.g., not
subject to high rates of mutation) as phage acting on that target were able to develop
and persist. Thus, the present invention identifies a subset of essential cellular
components which are particularly likely to be appropriate targets for development of
antibacterial agents.

The invention also, therefore, concerns the development or identification of
inhibitors of bacteria, in addition to the phage-encoded inhibitory proteins (or RNA
transcripts), which are active on the targets of bacteriophage-encoded inhibitors. As
described herein, such inhibitors can be of a variety of different types, but are
preferably small molecules.

The following description provides preferred methods for use in the various
aspects of the invention. However, as those skilled in the art will readily recognize,
other approaches can be used to obtain and process relevant information. Thus, the
invention is not limited to the specifically described methods. In addition, the
following description provides a set of steps in a particular order. That series of steps
describes the overall development involved in the present invention. However, it is
clear that individual steps or portions of steps may be usefully practiced separately,
and, further, that certain steps may be performed in a different order or even bypassed
if appropriate information is already available or is provided by other sources or
methods.

Selecting and Growing Phage, and Isolating DNA 

Conceptually, the first step involves selecting bacterial hosts of interest. 
Preferably, but not necessarily, such hosts will be pathogens of clinical importance. 
Alternatively, because bacteria all share certain fundamental metabolic and structural 
features, these features can be targeted for study in one strain, for example a 
nonpathogenic one, and extrapolated to similarly succeed in pathogenic ones. 
Nonpathogenic strains may also exhibit initial advantages in being not only less 
dangerous, but also, for example, in having better growth and culturing characteristics 
and/or better developed molecular biology techniques and reagents. Consequently, 
advantageously the invention provides the ability to target virtually any bacteria, but
preferably pathogenic bacteria, with antimicrobial compounds designed and/or 
developed using bacteriophage inhibitory proteins and peptides from phage with non- 
pathogenic and/or pathogenic hosts. 

We have selected Staphylococcus aureus, Streptococcus pneumoniae, various
Enterococci, and Pseudomonas aeruginosa as initial exemplary pathogens. These
bacteria are a major cause of morbidity and mortality in hospital-based infections, and
the appearance of antibiotic resistance in all of these organisms makes it increasingly
difficult to treat benign infections involving these organisms. Such infections can
include, for example, otitis media, sinusitis, and skin and airway infections (Neu,
H.C. (1992). Science 257, 1064-1073). However, the approach described below is
clearly applicable to any human bacterial pathogens including but not restricted to
Mycobacterium tuberculosis, Neisseria gonorrhoeae, Haemophilus influenzae,
Acinetobacter, Escherichia coli, Shigella dysenteriae, Streptococcus pyogenes,
Helicobacter pylori, and Mycoplasma species. This invention can also be applied to
the discovery of anti-bacterial compounds directed against pathogens of animals other
than humans, for example, sheep, cattle, swine, dogs, cats, birds, and reptiles.
Similarly, the invention is not limited to animals, but also applies to plants and plant
pathogens.

In general, the bacteria are grown according to standard methodologies
employed in the art, including solid, semi-solid or liquid culturing, which procedures
can be found in or extrapolated from standard sources such as Maloy, S.R., Stewart,
V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring
Harbor Laboratory Press, or Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; or
Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley &
Sons, Secaucus, NJ. Culture conditions are selected which are adapted to the
particular bacterium generally using culture conditions known in the art as
appropriate, or adaptations of those conditions.

Nucleic acids within these bacteria can be routinely extracted through common 
procedures such as described in the above-referenced manuals and as generally known 
to those skilled in the art. Those nucleic acid stocks can then be used to practice the 
other inventive aspects described below. 

Selection and Growth of Bacteriophage, and Isolation of DNA 

The second step involves assembling a group of bacteriophages (phage
collection) for one or more of the targeted bacterial hosts. While the invention can be
utilized with a single bacteriophage for a pathogen or other bacterium, it is preferable
to utilize a plurality of phage for each bacterium, as comparisons between a plurality
of such phage provide useful additional information. Non-limiting examples of
phage and sources for some of the above-mentioned pathogenic bacteria are found in
Table 1. The criteria used to select such phages are that they are infectious for the
microbe targeted, and replicate in, lyse, or otherwise inhibit growth of the bacterium
in a measurable fashion. These phages can be very different from one another
(representing different families), as judged by criteria such as morphology (head, tail,
plate, etc.), and similarity of genome nucleotide sequence (cross-hybridization). Since
such diverse bacteriophages are expected to block bacterial host metabolism and
ultimately inhibit by a variety of mechanisms, their combined study will lead to the
identification of different mechanisms by which the phages independently inhibit
bacterial targets. Examples include degradation of host DNA (Parson K.A., and
Snustad, D.P. (1975). J. Virol. 15, 221-444) and inhibition of host RNA transcription
(Severinova, E., Severinov, K. and Darst, S.A. (1998). J. Mol. Biol. 279, 9-18). This,
in turn, yields novel information on phage proteins that can inhibit the targeted
microbe. As explained below, this 1) forms the basis of novel drug discovery efforts
based on knowledge of the primary amino acid sequence of the phage inhibitor
protein (e.g., peptide fragments or peptidomimetics) and/or 2) leads to the
identification of bacterial biochemical pathways, the proteins of which are essential or
significant for survival of the targeted microbe, and which enzymatic steps or
chemical reactions can be targeted by classical drug discovery methods using
molecular inhibitors, for example, small molecule inhibitors.

Bacteriophage are generally either of two types, lytic or filamentous, meaning
they either outright destroy their host and seek out new hosts after replication, or else
continuously propagate and extrude progeny phage from the same host without
destroying it. Regardless of the phage life cycle and type, preferred embodiments
incorporate phage which impede cell growth in measurable fashion and preferably
stop cell growth. To this end, lytic phage are preferred, although certain nonlytic
species may also suffice, e.g., if sufficiently bacteriostatic.

Various procedures that are commonly understood by those of skill in the art
can be routinely employed to grow, isolate, and purify phage. Such procedures are
exemplified by those found in common laboratory aids such as Maloy, S.R.,
Stewart, V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold
Spring Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; and
Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John
Wiley & Sons, Secaucus, N.J. The techniques generally involve the culturing of
infected bacterial cells that are lysed naturally and/or with chemical assistance, for
example, by the use of an organic solvent such as chloroform that destroys the host
cells, thereby liberating the phage within. Following this, the cellular debris is
centrifuged away from the supernatant containing the phage particles, and the phage
are then selectively precipitated out of the supernatant using various
methods, usually employing alcohols and/or other chemical compounds
such as polyethylene glycol (PEG). The resulting phage can be further purified using
various density gradient/centrifugation methodologies. The resulting phage are then
chemically lysed, thereby releasing their nucleic acids, which can be conveniently
precipitated out of the supernatant to yield a viral nucleic acid supply of the phage of
interest.

Exemplary bacteriophage are indicated in Table 1, along with sources where
those phage may be obtained.

Exemplary bacteria include the reference bacteria for the identified 
bacteriophage, available from the same sources. 

Characterizing Bacteriophage Genomes for ORFs

The third step involves systematically characterizing the genetic information
contained in the phage genome. Within this genetic information is the sequence of all
RNAs and proteins encoded by the phage, including those that are essential or
instrumental in inhibiting their host. This characterization is preferably done in a
systematic fashion. For example, this can be done by first isolating high molecular
weight genomic DNA from the phage using standard bacterial lysis methods, followed
by phage purification using density gradient ultracentrifugation, and extraction of
nucleic acid from the purified phage preparation. The high molecular weight DNA is
then analyzed to determine its size and to evaluate a proper strategy for its sequencing.
The DNA is broken down into smaller size fragments by sonication or partial
digestion with frequently cutting restriction enzymes such as Sau3A to yield
predominantly 1 to 2 kilobase length DNA, which DNA can then be resolved by gel
electrophoresis followed by extraction from the gel.
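
By way of non-limiting illustration, the fragmentation and size-selection step can be modeled in silico. The following Python sketch assumes a linear genome and a complete Sau3A digest (recognition sequence GATC, cut immediately 5' of the G). This is a simplification: the partial digestion described above leaves some sites uncut and therefore yields longer fragments. The function names and the size window are illustrative assumptions only.

```python
import re

SAU3A_SITE = "GATC"  # Sau3A recognition sequence; the cut falls just 5' of the G


def digest(sequence, site=SAU3A_SITE):
    """Return the fragments of a complete digest of a linear sequence."""
    seq = sequence.upper()
    # A zero-width lookahead finds every (possibly overlapping) site start.
    cuts = [m.start() for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + [c for c in cuts if c > 0] + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies in the window, as a gel cut would."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

For an actual genome, a call such as `size_select(digest(genome), 1000, 2000)` would mimic excising the 1 to 2 kilobase band from the gel.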

The ends of the fragments are enzymatically treated to render them suitable for
cloning and the pools of fragments are cloned in a bacterial plasmid to generate a
library of the phage genome. Several hundred of these random DNA fragments
contained in the plasmid vector are isolated as clones after introduction into an
appropriate bacterium, usually Escherichia coli. They are then individually expanded
in culture and the DNA from each individual clone is purified. The nucleotide
sequences of the inserts of these clones are determined by standard automated or
manual methods, using oligonucleotide primers located on either side of the cloning
site to direct polymerase mediated sequencing (e.g., the Sanger sequencing method or
a modification of that method). Other sequencing methods can also be used.

The sequence of individual clones is then deposited in a computer, and
specific software programs (for example, Sequencher™, Gene Codes Corp.) are used
to look for overlap between the various sequences, resulting in ordering of contig
sequences and ultimately providing the complete sequence of the entire bacteriophage
genome (one such example is given in Table 2 for Staphylococcus aureus
bacteriophage 77; others are also provided herein). This complete nucleotide
sequence is preferably determined with a redundancy of at least 3- to 5-fold (number
of independent sequencing events covering the same region) in order to minimize
sequencing errors.
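
As a non-limiting illustration of the redundancy criterion, the following Python sketch computes per-base sequencing depth from the positions at which individual reads align to the assembled genome, and reports regions falling below a chosen minimum. The read representation (0-based, half-open intervals) and the function names are assumptions made for the example; they are not features of any particular assembly program.

```python
def coverage_depth(genome_length, reads):
    """Per-base depth from reads given as (start, end) half-open intervals."""
    deltas = [0] * (genome_length + 1)
    for start, end in reads:
        if not (0 <= start < end <= genome_length):
            raise ValueError(f"read ({start}, {end}) lies outside the genome")
        deltas[start] += 1   # depth rises where a read begins
        deltas[end] -= 1     # and falls just past where it ends
    depth, running = [], 0
    for delta in deltas[:genome_length]:
        running += delta
        depth.append(running)
    return depth


def under_covered(depth, minimum=3):
    """Return (start, end) half-open regions whose depth is below `minimum`."""
    regions, start = [], None
    for i, d in enumerate(depth):
        if d < minimum and start is None:
            start = i
        elif d >= minimum and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(depth)))
    return regions
```

Regions returned by `under_covered` would be candidates for additional sequencing before the genome sequence is treated as complete.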

Preferably, the bacterial strain used as a phage host should not possess any
other innate plasmids, transposons, or other phage or incompatible sequences that
would complicate or otherwise make the various manipulations and analyses more
difficult.

Commercially available computer software programs are used to translate the
nucleotide sequence of the phage to identify all protein sequences encoded by the
phage (hereafter called open reading frames or ORFs). (Customized software can
clearly also be used.) As phages are known to transcribe their genome into RNA from
both strands, in both directions, and sometimes in more than one frame for the same
sequence, this exercise is done for both strands and in all six possible reading frames.
As evolutionary constraints have forced the phage to conserve all of its vital protein
sequences in as small a genome as possible, it is straightforward to identify all the
proteins encoded by the phage by simple examination of the 6 translation frames of
the genome. Once these ORFs are identified, they are cataloged into a phage
proteome database (Table 3 lists ORFs identified from phage 77; ORF lists are also
provided for other exemplary phage). This analysis is preferably performed for each
phage under study. The process of ORF identification can be varied depending on the
desired results. For example, the minimum length for the putative encoded
polypeptide can be varied, and/or putative coding regions that have an associated
Shine-Dalgarno sequence can be selected. In the case of phage 77 ORFs, such
parameter adjustment was performed and resulted in the identification of ORFs as
listed herein. Different parameters had resulted in the identification of the ORFs
listed in the preceding U.S. Provisional Application 60/110,992, filed December 3,
1998, which is hereby incorporated by reference in its entirety.
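
By way of non-limiting illustration, a six-frame ORF scan with an adjustable minimum length and an optional Shine-Dalgarno filter can be sketched in Python as follows. The sketch is not the software actually used; the default start codons, the 20-nucleotide upstream window, and the Shine-Dalgarno-like motifs are assumptions chosen only to make the example concrete. Coordinates are reported 1-based on the forward strand and include the stop codon.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG codon order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SD_MOTIFS = ("AGGAGG", "GGAGG", "AGGAG")  # assumed Shine-Dalgarno-like cores


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome, min_aa=30, starts=("ATG", "GTG", "TTG"), require_sd=False):
    """Scan both strands and all three frames per strand for start-to-stop ORFs."""
    genome = genome.upper()
    n = len(genome)
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if start is None:
                    if codon in starts:
                        start = i          # first start codon opens the ORF
                    continue
                if codon not in STOPS:
                    continue
                aa_len = (i - start) // 3  # residues, stop codon excluded
                upstream = seq[max(0, start - 20):start]
                has_sd = any(m in upstream for m in SD_MOTIFS)
                if aa_len >= min_aa and (has_sd or not require_sd):
                    body = (CODON_TABLE.get(seq[j:j + 3], "X")
                            for j in range(start + 3, i, 3))
                    # The initiator codon is written as Met whatever its sequence.
                    protein = "M" + "".join(body)
                    if strand == "+":
                        lo, hi = start + 1, i + 3
                    else:
                        lo, hi = n - i - 2, n - start
                    orfs.append({"strand": strand, "frame": frame,
                                 "start": lo, "end": hi,
                                 "start_codon": seq[start:start + 3],
                                 "protein": protein})
                start = None
    return orfs
```

Varying `min_aa`, `starts`, or `require_sd` changes which ORFs are reported, which is one way different parameter choices can lead to different ORF lists for the same genome.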

Exemplary phage 77 ORFs identified in that provisional application and as 
identified herein are shown in the following table: 



ORF ID from    Genomic        a.a.    Start    ORF ID from    Genomic        a.a.    Start
60/110,992     position       size    codon    241/190        position       size    codon

77ORF016       23269-24024    251     TTG      77ORF017       23269-23982    237     ATG
77ORF019       39845-40501    218     ATA      77ORF019       39851-40501    216     ATG
77ORF050       29268-29564    98      ATG      77ORF182       29268-29564    98      ATG
77ORF050       29268-29564    98      ATG      77ORF043       29304-29564    86      ATG
77ORF067       34312-34551    79      CTG      77ORF104       34393-34551    52      ATG
77ORF146       29051-29212    53      ATG      77ORF102       29051-29212    53      ATG

Identifying and Characterizing Inhibitory Phage ORFs

The fourth step entails identifying the phage protein or proteins or RNA
transcripts that have the ability to inhibit their bacterial hosts. This can be
accomplished, for example, by either or both of two non-mutually exclusive methods.
The first method makes use of bioinformatics. Over the past few years, a large amount
of nucleotide sequence information and corresponding translated products have
become available through large genome sequencing projects for a variety of
organisms including mammals, insects, plants, unicellular eukaryotes (yeast and
fungi), as well as several bacterial genomes such as E. coli, Mycobacterium
tuberculosis, Bacillus subtilis, Staphylococcus aureus and many others. Such
sequences have been deposited in public databases (for example, the non-redundant
sequence database at GenBank and the SwissProt protein sequence database;
http://www.ncbi.nlm.nih.gov) and can be freely accessed to compare any specific
query sequence to those present in such databases. For example, GenBank contains
over 1.6 billion nucleotides corresponding to 2.3 million sequence records. Several
computer programs and servers (e.g., TBLASTN) have been created to allow the rapid
identification of homology between any given sequence from one organism and that of
another present in such databases, and such programs are public and available free of
charge.
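
To illustrate the principle of a homology search, the following Python sketch scores a query against each entry of a small sequence collection by Smith-Waterman local alignment and ranks the entries by score. It is a simplified stand-in and not an implementation of TBLASTN or any other database search program: the scoring values are arbitrary assumptions, no substitution matrix or statistical significance is computed, and the database is represented as a hypothetical in-memory dictionary.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the local alignment score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def rank_hits(query, database):
    """Rank database entries (name -> sequence) by local alignment score."""
    scored = ((smith_waterman(query, seq)[0], name) for name, seq in database.items())
    return sorted(scored, reverse=True)
```

In practice a phage ORF translation would be submitted to a public search server; the sketch only shows why a shared local region yields a high score even when flanking sequence is unrelated.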

In addition, it has been well established that basic biochemical pathways can
be conserved in very distant organisms (for example bacteria and man), and that the
proteins performing the various enzymatic steps in these pathways are themselves
conserved at the amino acid sequence level. Thus, proteins performing similar
functions (e.g., DNA repair, RNA transcription, RNA translation) have frequently
preserved key structural signatures, identifiable by similarities across regions of
proteins (domains and motifs). The antimicrobials of the present invention will
preferably target features and targets that are highly characteristic or conserved in
microbes, and not higher organisms.

Most genomes encode individual proteins or groups of proteins that can be
assembled into protein families that have been evolutionarily conserved. Therefore,
similarity between a new query sequence and that of a member of a protein family
(reference sequences from public databases) can immediately suggest a biochemical
function for the novel query sequence, which in our case is a phage ORF.

The sequence homology between evolutionarily distant
members of a protein family is usually not randomly distributed along the entire
length of the sequence but is often clustered into "motifs" and "domains". These
correspond to key three-dimensional folds that form key catalytic and/or regulatory
structures that perform key biochemical function(s) for the group of proteins.
Commercially available computer software programs can identify such motifs in a
new query sequence, again providing functional information for the query sequence.
Such structural and functional motifs have also been derived from the combined
analysis of primary sequence databases (protein sequences) and protein structure
databases (X-ray crystallography, nuclear magnetic resonance) using so-called
"threading" methods (Rost, B. and Sander, C. (1996) Ann. Rev. Biophy. Biomol.
Struct. 25, 113-136).
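
As a non-limiting illustration of motif identification, the following Python sketch converts a pattern written in PROSITE-style syntax into a regular expression and reports where it occurs in a protein sequence. Only a subset of the syntax is handled (residue sets, exclusions, `x` wildcards, repeat counts, and terminal anchors), and the function names are assumptions made for the example. The pattern used in the usage note, `[AG]-x(4)-G-K-[ST]`, is the well-known ATP/GTP-binding P-loop motif and is chosen only as a familiar example; it is not asserted to occur in any phage ORF described herein.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) to a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(4) -> x{4}
        element = element.replace("x", ".")                     # x = any residue
        element = element.replace("<", "^").replace(">", "$")   # terminal anchors
        parts.append(element)
    return "".join(parts)


def find_motifs(protein, pattern):
    """Return (1-based start, matched residues) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(protein.upper())]
```

For example, `find_motifs(orf_protein, "[AG]-x(4)-G-K-[ST].")` would list candidate nucleotide-binding sites in a translated ORF, where `orf_protein` is a hypothetical input string.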

Such motifs and folds are themselves deposited in public databases which can
be directly accessed (for example, SwissProt database; 3D-ALI at EMBL, Heidelberg;
PROSITE). This basic exercise leads to a structural homology map in which each of
the phage ORFs has been probed for such similarities, and where initial structural and
functional hits are identified (selected examples of sequence homologies detected
between individual ORFs from the genome of Staphylococcus aureus bacteriophage
77 and sequences deposited in public databases are shown in Table 5 for ORFs
17/19/43/102/104/182).

This analysis can point out phage proteins with similarity to proteins from
other phages (such as those for E. coli) playing an important role in the basic
biochemical pathways of the phage (such as DNA replication, RNA transcription,
tRNAs, coat protein and assembly). Selected examples of such proteins include
integrase and capsid protein. Therefore, this analysis enables identification and
elimination of non-essential ORFs as candidates for an inhibitor function, as well as
the identification of (potentially) useful ones.

In addition, this analysis can point out specific ORFs as possible inhibitor
ORFs. For example, these ORFs may encode proteins or enzymes that alter bacterial
cell structure, metabolism or physiology, and ultimately viability. Examples of such
proteins present in the genome of Staphylococcus aureus bacteriophage 77 include
orf14 (deoxyuridine triphosphatase from bacteriophage T5), and orf15 (sialidase).
(These ORF identifications are as listed in provisional application 60/110,992.) Other
examples include ORFs 9 and 12 of S. aureus phage 44 AHJD, which encode the
putative lysis functions found in many bacteriophages - a "holin" and an "amidase".

In addition, it is well known that bacterial and eukaryotic viruses can usurp
pathways from their host in order to use them to their advantage in blocking host
cellular pathways upon infection. The phage can achieve this by 1) directly producing
an inhibitor of a key host pathway (e.g., T7 gene 0.5 and 2), 2) directly producing a
novel activity (e.g., T4 DNA polymerase), and 3) altering concentrations of cell
components by producing similar functions (e.g., T4 transfer RNAs). The
identification of sequence similarity between phage ORFs and bacterial host genome
sequences will be highly indicative of such a mechanism. (Selected examples of such
homologies are listed in Figure 4 of the provisional application 60/110,992 and
include orf4 (homologous to autolysin), orf20 (hypothetical protein from
Staphylococcus aureus) and orf29 (hypothetical protein from Staphylococcus aureus).)
These ORFs can be analyzed by a standard biochemical approach to directly test their
inhibitor functions (e.g., as described below).

Alternatively, a homology search may reveal that a given phage ORF is related
to a protein present in the databases having an activity known to be inhibitory (e.g.,
inhibitor of host RNA polymerase by E. coli bacteriophage T7). Such a finding would
implicate the phage ORF product in a related activity. This will also suggest that a
new antimicrobial could be derived by a mimetic approach (e.g., peptidomimetic)
imitating this function or by a small molecule inhibitor to the bacterial target of the
phage ORF, or any steps in the relevant host metabolic pathway, e.g., high throughput
screening of small molecule libraries. Selected examples of such similarity between
ORFs of Staphylococcus aureus bacteriophage 77 and proteins with inhibitor functions
for bacterial hosts are listed in Figure 4 of the provisional application 60/110,992.
These include orf9 (similar to bacteriophage P1 kilA function), and orf4 (autolysin of
Staphylococcus aureus, amidase enzymatic activity).

A reason for the biochemical study of individual ORFs for inhibitor function is 
that their expression or overexpression will block cellular pathways of the host, 
ultimately leading to arrest and/or inhibition of host metabolism. In addition, such 
ORFs can alter host metabolism in different ways, including modification of 
pathogenicity. Therefore, individual ORFs identified above are expressed, preferably 
overexpressed, in the host and the effect of this expression or overexpression on host 
metabolism and viability is measured. This approach can be systematically applied to 
every ORF of the phage, if necessary, and does not rely on the absolute identification 
of candidate ORFs by bioinformatics. Individual ORFs are resynthesized from the 
phage genomic DNA, e.g., by the polymerase chain reaction (PCR), preferably using 
oligonucleotide primers flanking the ORF on either side. These single ORFs are 
preferably engineered so that they contain appropriate cloning sites at their extremities 
to allow their introduction into a new bacterial expression plasmid, allowing 
propagation in a standard bacterial host such as E. coli, but containing the necessary 
information for plasmid replication in the target microbe such as S. aureus (hereafter 
referred to as shuttle vector). Shuttle vectors and their use are well known in the art. 
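
By way of non-limiting illustration, primers that add cloning sites to the extremities of an ORF can be sketched in Python as follows. The sketch makes several assumptions not taken from the description: the primers anneal to the first and last bases of the ORF itself, the added sites are BamHI (GGATCC) and HindIII (AAGCTT), a short 5' clamp is prepended to aid digestion near the fragment end, and the annealing length is fixed. The function name and defaults are hypothetical; actual site choice depends on the shuttle vector used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def design_cloning_primers(genome, start, end, site_5="GGATCC", site_3="AAGCTT",
                           anneal=18, clamp="CGC"):
    """Return (forward, reverse) primers, written 5'->3', for an ORF.

    `start` and `end` are 1-based inclusive forward-strand coordinates.
    """
    orf = genome[start - 1:end].upper()
    if len(orf) < anneal:
        raise ValueError("ORF is shorter than the annealing length")
    for site in (site_5, site_3):
        # An internal site would be cut during cloning, truncating the insert.
        if site in orf or reverse_complement(site) in orf:
            raise ValueError(f"cloning site {site} also occurs inside the ORF")
    forward = clamp + site_5 + orf[:anneal]
    reverse = clamp + site_3 + reverse_complement(orf[-anneal:])
    return forward, reverse
```

The check for internal sites reflects the practical requirement that the chosen enzymes cut only at the engineered ends of the amplified ORF.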

Such shuttle vectors preferably also contain regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode an
inhibitor function that will eliminate the host, it is beneficial that it not be expressed
prior to testing for activity. Thus, screening for such sequences when expressed in a
constitutive fashion is less likely to be successful when the inhibitor is lethal. In the
exemplary inducible system presented in Figures 1A, 1B, 2, and 7, regulatory
sequences from the ars operon of S. aureus are used to direct individual ORF
expression in S. aureus (or other bacteria in which the ars system is functional). The
ars operon encodes a series of proteins which normally mediate the extrusion of
arsenite and other trivalent oxyanions from the cells when they are exposed to such
toxic substances in their environment. The operon encoding this detoxifying
mechanism is normally silent and only induced when arsenite-related compounds are
present. (Tauriainen, S. et al. (1997) App. Env. Microbiol. Vol. 63, No. 11, p. 4456-4461.)

Therefore, individual phage ORFs can be expressed in S. aureus in an
inducible fashion by adding to the culture medium non-toxic arsenite concentrations
during the growth of individual S. aureus clones expressing such individual phage
ORFs. Toxicity of the phage inhibitor ORF for the host is monitored by reduction or
arrest of growth under induction conditions, as measured by optical density in liquid
culture or after plating the induced cultures on solid medium. Subsequently,
interference of the phage ORF with the host biochemical pathways ultimately leading
to reduced or arrested host metabolism can be measured by pulse-chase experiments
using radiolabeled precursors of either DNA replication, RNA transcription, or protein
synthesis. Similar constructs can be made and used for other bacteria using
well-known techniques.
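
As a non-limiting illustration of how optical density readings might be reduced to a growth inhibition score, the following Python sketch compares an induced culture with its uninduced control and flags clones exceeding a chosen cutoff. The 50% cutoff, the blank subtraction, and the function names are assumptions for the example only; no particular threshold is prescribed herein.

```python
def percent_inhibition(od_uninduced, od_induced, od_blank=0.0):
    """Percent reduction in growth of the induced culture versus its control."""
    control = od_uninduced - od_blank
    treated = od_induced - od_blank
    if control <= 0:
        raise ValueError("uninduced control shows no growth above the blank")
    # Clamp at zero so that slightly better growth is not reported as negative.
    return max(0.0, (1.0 - treated / control) * 100.0)


def flag_inhibitory(readings, threshold=50.0):
    """readings maps clone name -> (od_uninduced, od_induced); returns flagged names."""
    return sorted(name for name, (uninduced, induced) in readings.items()
                  if percent_inhibition(uninduced, induced) >= threshold)
```

Clones flagged in this way would then be candidates for the pulse-chase experiments described above.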

Those skilled in the art are familiar with a variety of other inducible systems
which can also be used for the controlled expression of phage ORFs, including, for
example, lactose (see, e.g., Stratagene's LacSwitch™ II system; La Jolla, CA) and
tetracycline-based systems (see, e.g., Clontech's Tet On/Tet Off™ system; Palo Alto,
CA). The arsenite-inducible system described is further depicted in Figures 1, 2 and 7.
The selection or construction of shuttle vectors and the selection and use of
inducible systems are well known and thus other shuttle vectors appropriate for other
bacteria can be readily provided by those skilled in the art, e.g., for use in other
bacterial species.

Standard methodologies for expressing proteins from constructs, and isolating
and manipulating those proteins, for example in cross-linking and affinity
chromatography studies, may be found in various commonly available and known
laboratory manuals. See, e.g., Current Protocols in Protein Science. John Wiley &
Sons, Secaucus, N.J., and Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.

It has been found that certain phage or other viruses inhibit host cells, at least
in part, by producing an antisense RNA which binds to and inhibits translation from a
bacterial RNA sequence. Thus, in the case of potentially inhibitory RNA transcripts
encoded by the phage genome, a strong indicator of a possible inhibitory function is
provided by the identification of phage sequence which is identical to or fully
complementary (or with only a small percentage of mismatch, e.g., <10%, preferably
less than 5%, most preferably less than 3%) to a bacterial sequence. This approach is
convenient in the case of bacteria that have been essentially completely sequenced, as
the comparison can be performed by computer using public database information.
The inhibitory effect of the transcript can be confirmed using expression of the
phage sequence in a host bacterium. If needed, such inhibition can also be tested by
transfecting the cells with a vector that will transcribe the phage sequence to form
RNA in such manner that the RNA produced will not be translated into a polypeptide.
Inhibition under such conditions provides a strong indication that the inhibition is due
to the transcript rather than to an encoded polypeptide.
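
By way of non-limiting illustration, the computer comparison of phage sequence against bacterial sequence for identity or complementarity can be sketched in Python as follows. The sketch uses a fixed window and a brute-force comparison, which is adequate only for short sequences; the window length and function names are assumptions for the example, while the default mismatch limit follows the "<10%" figure given above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_antisense_candidates(phage, host, window=20, max_mismatch=0.10):
    """Return (phage_pos, host_pos, orientation, mismatch_fraction) tuples.

    Positions are 0-based window starts. A "complementary" hit means the
    reverse complement of the phage window matches the host window.
    """
    phage, host = phage.upper(), host.upper()
    hits = []
    for i in range(len(phage) - window + 1):
        segment = phage[i:i + window]
        probes = (("identical", segment),
                  ("complementary", reverse_complement(segment)))
        for j in range(len(host) - window + 1):
            target = host[j:j + window]
            for orientation, probe in probes:
                fraction = hamming(probe, target) / window
                if fraction < max_mismatch:
                    hits.append((i, j, orientation, fraction))
    return hits
```

Setting `max_mismatch` to 0.05 or 0.03 applies the more stringent preferred limits.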

In an alternative, the expression of an ORF in a host bacterium is found to be
inhibitory, but the inhibition is found to be due to an RNA product of the genomic
coding region. For antisense inhibition, the sequence of the bacterial target nucleic
acid sequence can be identified by inspection of the phage sequence, and the full
sequence of the relevant coding region for the bacterial product can be found from a
database of the bacterial genomic sequence or can be isolated by standard techniques
(e.g., a clone in a genomic library can be isolated which contains the full bacterial
ORF, and then sequenced).

In either case, the identification of a target which is inhibited by an RNA
transcript produced by a phage provides both the possible inhibition of bacteria
naturally containing the same target nucleic acid sequence, as well as the ability to use
the target sequence in screening for other types of compounds which will act directly
on the target nucleic acid sequence or on a polypeptide product expressed or
regulated, at least in part, by the target of the inhibitory phage RNA.

In some cases it will be found that the target of an inhibitory phage RNA or
protein has previously been found to be a target for an antibacterial agent. In such
cases, the phage inhibitor can still provide useful information if it is found that the
phage-encoded product acts at a different site than the previously identified
antibacterial agent or inhibitor, i.e., acts at a phage-specific site. For many targets,
action at a different site provides highly beneficial characteristics and/or information.
For example, an alternate site of inhibitor action can at least partially overcome a
resistance mechanism in a bacterium. As an illustration, in many cases, resistance is
due, in large part, to altered binding characteristics of the immediate target to the
antibacterial agent. The altered binding is due to a structural change which prevents
or destabilizes the binding. However, the structural change is frequently quite local,
so that compounds which bind at different local sites will be unaffected or affected to a
much lesser degree. Indeed, in some cases the local sites will be on a different
molecule and so may be completely unaffected by the local structural change creating
resistance to the original agent(s). An example of resistance due to altered binding is
provided by methicillin-resistant Staphylococcus aureus, in which the resistance is
due to an altered penicillin-binding protein.

In other cases, a new site of action can have improved accessibility as
compared to a site acted on by a previously identified agent. This can, for example,
assist in allowing effective treatment at lower doses, or in allowing access by a larger
range of types of compounds, potentially allowing identification of more potential
active agents.

Another advantage is that the structural characteristics of a different site of
action will lead to identification and/or development of inhibitors with different
structures and different pharmacological parameters. This can allow a greater range of
possibilities when selecting an antibacterial agent.

Yet further, different sites often produce different inhibitory characteristics in
the target organism. This is commonly the case for multi-domain target proteins.
Thus, inhibition targeting an alternate site can produce more efficacious action, e.g.,
faster killing, slower development of resistance, lower numbers of surviving cells, and
different secondary effects (for example, different nutrient utilization).

Staphylococcus aureus phage 77 

As indicated above, the present invention is concerned, in part, with the use of
bacteriophage 77 coding sequences and the encoded polypeptides or RNA transcripts
to identify bacterial targets for potential new antibacterial agents.

As described, phage 77 ORFs 17, 19, 43, 102, 104, and 182 have been found
to have bacteria-inhibiting function. Identification of ORFs 17, 19, 43, 102, 104, and
182 and products from the phage which inhibit the host bacterium both provides an
inhibitor compound and allows identification of the bacterial target affected by the
phage-encoded inhibitor. Such a target is thus identified as a potential target for
development of other antibacterial agents or inhibitors and the use of those targets to
inhibit those bacteria. As indicated above, even if such a target is not initially
identified in a particular bacterium, such a target can still be identified if a
homologous target is identified in another bacterium. Usually, but not necessarily,
such another bacterium would be a genetically closely related bacterium. Indeed, in
some cases, an inhibitor encoded by phage 77 ORF 17, 19, 43, 102, 104, or 182 can
also inhibit such a homologous bacterial cellular component.

Possible bacterial target sequences are described herein by reference to sequence
source sites. In preferred embodiments, the sequence encoding the target corresponds
to a S. aureus nucleic acid sequence available from numerous sources including S.
aureus sequences deposited in GenBank, S. aureus sequences found in European
Patent Application No. 97100110.7 to Human Genome Sciences, Inc. filed January 7,
1997, S. aureus sequences available from TIGR at
http://www.tigr.org/tdb/mdb/mdb.html, and S. aureus sequences available from the
Oklahoma University S. aureus sequencing project at the following URL:
http://www.genome.ou.edu/staph_new.html. Such possible targets are particularly
applicable to S. aureus phages 77, 3A, 96, and 44 AHJD.

The amino acid sequence of a polypeptide target is readily provided by 
translating the corresponding coding region. For the sake of brevity, the sequences 
are not reproduced herein. Also, in preferred embodiments, a target sequence 
corresponds to a S. aureus coding sequence corresponding to a sequence listed in 
Table 15 herein. The listing in Table 15 describes S. aureus sequences currently listed 
with GenBank. Again, for the sake of brevity, the sequences are described by 
reference to the database accession numbers instead of being written out in full herein. 
In cases where an entry for a coding region is not complete, the complete sequence 
can be readily obtained by routine methods, e.g., by isolating a clone in a phage host 
S. aureus genomic library, and sequencing the clone insert to provide the relevant 
coding region. The boundaries of the coding region can be identified by conventional 
sequence analysis and/or by expression in a bacterium in which the endogenous copy 
of the coding region has been inactivated and using subcloning to identify the 
functional start and stop codons for the coding region. 
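
The translation step mentioned above can be illustrated with a minimal sketch. It 
assumes the standard codon assignments (which the bacterial translation table shares) 
and a coding region that is already in frame. The function name and the example 
sequences are illustrative only and are not part of the disclosure. 

```python
# Minimal sketch: translate an in-frame coding region into an amino acid sequence.
# Assumption: standard codon assignments; translation stops at the first stop codon.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry codon table from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence (DNA or RNA letters) up to the first stop codon.

    Codons containing ambiguous bases are reported as 'X'; an incomplete
    trailing codon is ignored.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":  # stop codon ends the polypeptide
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGAAATAA"))
```

In practice a database accession would first be retrieved and its annotated coding 
region extracted before a function of this kind is applied. 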

Staphylococcus aureus phage 44AHJD 

The present invention also can utilize the identification of naturally occurring 
DNA sequence elements within Staphylococcus aureus bacteriophage 44AHJD which 
encode proteins with antimicrobial activity. 

Such identification can utilize bioinformatic identification of specific proteins 
(ORFs) used by Staphylococcus aureus bacteriophage 44AHJD during the viral life 
cycle that result in a slowing or arrest of growth of the bacterial host, or in death 
of the Staphylococcus aureus host, including lysis of the infected bacteria. Thus, some of 
the bacteriophage 44AHJD DNA sequences encoding these proteins (ORFs) are 
predicted to encode antimicrobial functions. Information derived from these DNA 
sequences and translated ORFs can, in turn, be utilized to develop inhibitory 
compounds by peptidomimetics that can also function as antimicrobials. In addition, 
the host bacterial proteins that are targeted and inhibited by the 
antimicrobial bacteriophage ORFs can themselves provide novel targets for drug 
discovery. 

The methodology described above is used to identify and characterize DNA 
sequences from Staphylococcus sp. bacteriophage 44AHJD that have antimicrobial 
activity. As described in the Examples, the Staphylococcus aureus propagating strain 
(PS 44A), obtained from the Felix d'Herelle Reference Centre (#HER 1101), was 
used as a host to propagate its phage 44AHJD, also obtained from the Felix d'Herelle 
Reference Centre (#HER 101). By sequencing, we found that bacteriophage 44AHJD 
consists of 16,668 bp (Table 16) predicted to encode 73 ORFs greater than 33 amino 
acids (Tables 17 & 18). Computational analysis of the predicted protein products of 
Staphylococcus aureus bacteriophage 44AHJD identified homologs in public sequence 
databases as listed in Tables 19 and 20, along with the accompanying list of related 
proteins. 
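
As an illustration of how ORFs above a size cutoff might be enumerated from a phage 
genome sequence, the following sketch scans all six reading frames. It is not the 
procedure used to generate Tables 17 and 18; it assumes, for simplicity, that an ORF 
runs from the first ATG after a stop to the next in-frame stop codon (alternative 
start codons are ignored), and the function names are hypothetical. 

```python
# Sketch: enumerate ORFs longer than a cutoff in all six reading frames.
# Assumptions (not from the source): ORF = first ATG to the next in-frame stop;
# only ATG is treated as a start codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(genome, min_aa=34):
    """Return (strand, start, end, n_aa) for each ORF of at least min_aa residues.

    min_aa=34 corresponds to "greater than 33 amino acids". Coordinates are
    0-based and, for the '-' strand, refer to the reverse-complemented sequence.
    n_aa counts the start codon but not the stop codon.
    """
    genome = genome.upper()
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (i - start) // 3
                    if n_aa >= min_aa:
                        orfs.append((strand, start, i + 3, n_aa))
                    start = None
    return orfs


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 40 + "TAA"  # a single 41-residue ORF
    print(find_orfs(toy))
```

The predicted ORFs would then be translated and compared against public databases, as 
described above. 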

From this analysis, it is apparent that 3 genes (ORF 3, 7, and 8) are related to 
structural proteins found in other bacteriophages. These include genes predicted to 
encode a tail protein (ORF 3), an upper collar/connector protein of the phage virion 
(ORF 7), and a lower collar protein (ORF 8). Bioinformatics has also identified one 
gene whose product is likely involved in phage DNA synthesis. One gene (ORF 1) 
shows significant homology to DNA polymerases of a number of bacteriophages, 
bacteria and fungi, and the product of this gene is likely responsible for replicating 
the genetic material of bacteriophage 44AHJD. ORF 2 encodes a protein with 
homology to the product of the dinC gene of Bacillus subtilis, which encodes a protein 
involved in teichoic acid biosynthesis. Teichoic acid is a polyphosphate polymer found 
in some, but not all, Gram positive organisms (and not in Gram negative organisms), 
where it is attached to the peptidoglycan layer. The phage protein may thus be involved 
in the synthesis of this material for incorporation into the cell wall, allowing enhanced 
lysis by the phage lysis enzymes or, as many enzymes can function in "reverse reactions", 
may be involved in its degradation, allowing for penetration of the peptidoglycan and 
phage genome entry into the cell following adsorption. The similarity between 
Staphylococcus aureus bacteriophage 44AHJD and E. coli phage T7 indicates that 
they may share similar mechanisms of replication and growth. Both phages belong to 
the Podoviridae Family of bacteriophages and are members of the "T7-like" Genus 
of this Family (Ackermann and DuBow; VIth ICTV Report). 

Two genes, ORF 9 and 12, were identified with the potential to encode 
antimicrobial protein products. The homology alignments are shown in Tables 19 and 
20. The predicted product of ORF 9 is related to a class of genes which encodes 
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide 
cell wall structure of a variety of micro-organisms, including that from the 
Staphylococcus aureus bacteriophage Twort. ORF 12 of Staphylococcus aureus 
bacteriophage 44AHJD shows homology to a set of lysis proteins from several 
bacteriophages. These lysis proteins are also referred to as holins, and represent 
phage-encoded lysis functions required for transit of the phage murein hydrolases 
(lysozyme) to the periplasm, where they can digest the cell wall and thus lyse the 
bacterium. 

Thus, in particular embodiments, the present invention provides a nucleic acid 
sequence isolated from Staphylococcus aureus bacteriophage 44AHJD comprising at 
least a portion of one of the genes described above with antimicrobial activity. For 
example, ORF 1 encodes a DNA polymerase function. This polymerase may utilize 
host-derived accessory proteins for its activity when replicating the phage template, 
sequestering such proteins from use by the bacterial polymerase, resulting in 
inhibition of DNA replication, cell division, and cell growth. Alternatively, ORF 9 
directly encodes a polypeptide with antimicrobial activity. ORF 9 is predicted to 
encode an amidase, a protein known to act as a cell wall degrading enzyme. ORF 12 
likely encodes a holin function required for transit of the phage amidase (gene 9 
product) to the periplasm. When the gene for this type of product from Bacillus phage 
phi 29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993). 
Thus, production of proteins from Bacillus phage phi 29 gene 14 in E. coli resulted in 
cell death, whereas production of protein from Bacillus phage phi 29 gene 14 
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led 
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in 
the cytoplasmic membrane (Steiner et al., 1993). 

The present invention also provides the use of the Staphylococcus 
bacteriophage 44AHJD antimicrobial ORFs or ORF products as pharmacological 
agents, either wholly or in part, and derivatives thereof, as well as the use of 
corresponding peptidomimetics developed from amino acid or nucleotide sequence 
knowledge derived from Staphylococcus bacteriophage 44AHJD killer ORFs. 

Enterococcus phage 182 

Bacteriophage 182 was obtained from the Felix d'Herelle phage collection 
(Ste. Foy, Quebec) and infects Enterococcus sp. Group D. The genome of 
Enterococcus bacteriophage 182 consists of 17,833 bp (Table 21) and is predicted to 
encode 80 ORFs greater than 33 amino acids (Tables 22 and 23). Computational 
analysis of the predicted protein products of Enterococcus bacteriophage 182 was 
performed in order to identify protein products related to those deposited in public 
databases. Bacteriophage 182 protein products which detected sequences with 
significant sequence similarity in public databases are listed in Tables 24 and 26, along 
with the accompanying list of related proteins. 

From this analysis, it is apparent that 5 genes (ORF 001, 004, 007, 009, and 
011) are related to structural proteins of several Bacillus phages: Bacillus 
bacteriophage PZA, phi-29, and B103. These include genes predicted to encode a tail 
protein (ORF 001), a head protein (ORF 004), an upper collar protein (ORF 007), a 
lower collar protein (ORF 009), and a pre-neck appendage protein (ORF 011). Two 
genes, ORF 005 and 019, are predicted to encode products which direct phage 
morphogenesis. 

Bioinformatics has also identified three genes whose products are likely 
involved in phage DNA synthesis. One gene, ORF 002, shows significant homology to 
DNA polymerases of a number of bacteriophages, and the product of this gene is 
likely responsible for replicating the genetic material of bacteriophage 182. ORF 006 
encodes a protein with homology to the encapsidation proteins of several other 
bacteriophages, including Bacillus phage phi-29 (P11014), PZA (P07541), and B103 
(X99260) and Streptococcus phage CP-1 (Z47794). These gene products catalyze the 
in vivo and in vitro genome-encapsidation reaction (Garvey et al., 1985). Proteins 
involved in genome packaging have been shown to have additional activities that 
affect biochemical reactions in other phages and their hosts. For example, the coat 
protein of the RNA bacteriophage MS2 interacts with viral RNA to translationally 
repress replicase synthesis (Pickett and Peabody, 1993). This protein-RNA interaction 
also plays a role in genome encapsidation, enveloping a single copy of the viral 
genome in a protein shell composed of many molecules of coat protein. In addition, 
the bacteriophage λ terminase enzyme can be lethal to E. coli when expressed, 
suggesting cleavage of packaging sites in the bacterial chromosome. Also present 
within bacteriophage 182 is a gene, ORF 010, that encodes a protein that is related to 
the terminal proteins of Bacillus phage Nf (P06812), Bacillus phage GA-1 (X96987) 
and Bacillus phage B103 (X99260). DNA terminal proteins are linked to the 5' ends 
of both strands of the genome and are essential for DNA replication, playing a role in 
initial priming of DNA replication. The similarity between Enterococcus 
bacteriophage 182 and Bacillus phages phi-29, PZA, and B103 indicates that they 
may share similar mechanisms of replication and growth. Protein-primed DNA 
replication is a well described phenomenon, and in the phi-29-like phages, the ends of 
the DNA serve as origins and termini of replication (Gutierrez et al., 1986; Yoshikawa 
et al., 1985). 

There is also a gene (ORF 015) that encodes a protein showing homology to 
an early protein product of Bacillus bacteriophage PZA and the single-strand nucleic 
acid binding protein of bacteriophage B103. 

Two genes, ORF 008 and 014, were identified with the potential to encode 
antimicrobial protein products. The homology alignments are shown in Tables 24 & 
26, and biochemical features of the predicted polypeptides are shown in Table 25. The 
predicted product of ORF 008 is related to a class of genes which encodes 
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide cell 
wall structure of a variety of micro-organisms. ORF 014 of Enterococcus phage 182 shows 
homology to a set of lysis proteins from Bacillus bacteriophage phi-29, PZA, and 
B103. These lysis proteins are also referred to as holins and represent phage-encoded 
lysis functions required for transit of the phage murein hydrolases (lysozyme) to the 
periplasm, where they can digest the outer cell wall and thus lyse the bacterium. 

Thus, the present invention provides a nucleic acid sequence obtained from 
Enterococcus bacteriophage 182 comprising at least a portion of a phage 182 ORF, 
preferably an inhibitory ORF, and more preferably at least a portion of one of the 
genes described above with anti-microbial activity. For example, ORF 002 encodes a 
DNA polymerase function. This polymerase may utilize host-derived accessory 
proteins for its activity when replicating the phage template, sequestering such 
proteins from use by the bacterial polymerase, resulting in inhibition of DNA 
replication, cell division, and cell growth. Alternatively, ORFs 008 or 014 directly 
encode polypeptides with anti-microbial activity. ORF 008 is predicted to encode an 
autolytic lysozyme, a protein known to have anti-microbial activity (Martin et al., 
1998). ORF 014 likely encodes a holin function required for transit of the phage 
murein hydrolases to the periplasm. When the gene for the related product from Bacillus 
phage phi 29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 
1993). Thus, production of proteins from Bacillus phage phi 29 gene 14 in E. coli 
resulted in cell death, whereas production of protein from Bacillus phage phi 29 gene 14 
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led 
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in 
the cytoplasmic membrane (Steiner et al., 1993). 

The present invention also provides the use of the Enterococcus bacteriophage 
182 anti-microbial ORFs as pharmacological agents, either wholly or in part, and 
derivatives thereof, as well as the use of corresponding peptidomimetics developed from 
amino acid or nucleotide sequence knowledge derived from Enterococcus 
bacteriophage 182 killer ORFs. This can be done where the structure of the 
peptidomimetic compound corresponds to the structure of the active portion of a 
product of an ORF. In this analysis, the peptide backbone is transformed into a 
carbon-based hydrophobic structure that can retain cytostatic or cytocidal activity for 
the bacterium. This is done by standard medicinal chemistry methods, measuring growth 
inhibition of the various molecules in liquid cultures or on solid medium. These 
mimetics also represent lead compounds for the development of novel antibiotics. In 
this context, "corresponds" means that the peptidomimetic compound structure has 
sufficient similarities to the structure of the active portion of a product of one of the 
Enterococcus ORFs listed that the peptidomimetic will interact with the same 
molecule as the product of the ORF, and preferably will elicit at least one cellular 
response in common which relates to the inhibition of the cell by the phage protein. 

To validate the identity of an ORF as a killer ORF, it is preferably expressed 
in the host or other test bacterial organism and the effect of this expression on 
bacterial growth and replication is assessed. Therefore, all individual ORFs identified 
herein, e.g., those identified above, can be expressed, preferably overexpressed, in a 
suitable host bacterium, e.g., a host Enterococcus, and the effect of this expression or 
overexpression on host metabolism and viability can be measured. 

Individual ORFs can be resynthesized from the phage genomic DNA by the 
polymerase chain reaction (PCR) using oligonucleotide primers flanking the ORF on 
either side. Those skilled in the art are familiar with the design and synthesis of 
appropriate primer sequences. These single ORFs are preferably engineered so that 
they contain appropriate cloning sites at their extremities to allow their introduction 
into a new bacterial expression plasmid, allowing propagation in a standard bacterial 
host such as E. coli, but containing the necessary information for plasmid replication 
in the target microbe, Enterococcus sp. (hereafter referred to as a shuttle vector). 
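
By way of illustration only, primers carrying cloning sites at their 5' ends can be 
sketched as below. The choice of BamHI (GGATCC) and HindIII (AAGCTT) sites, the 
annealing length, and the function names are assumptions made for the example and 
are not taken from the disclosure; as a simplification, the primers here anneal to the 
first and last bases of the ORF itself rather than to flanking genomic sequence. 

```python
# Sketch: build forward and reverse PCR primers that add cloning sites to an ORF.
# Assumptions: BamHI/HindIII sites and a 20-nt annealing region are illustrative
# defaults; real primer design would also check melting temperature, internal
# restriction sites, and reading frame relative to the vector.


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_flanking_primers(orf, site_5="GGATCC", site_3="AAGCTT", anneal_len=20):
    """Return (forward, reverse) primers, each written 5' to 3'.

    The forward primer is the 5' cloning site followed by the start of the ORF;
    the reverse primer is the 3' cloning site followed by the reverse
    complement of the end of the ORF.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    forward = site_5 + orf[:anneal_len]
    reverse = site_3 + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = design_flanking_primers("ATGAAACCCGGGTTTTAA", anneal_len=6)
    print(fwd, rev)
```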

This shuttle vector also preferably contains regulatory sequences that allow 
inducible expression of the introduced ORF. As the candidate ORF may encode a 
killer function that will eliminate the host, it is highly advantageous that it not be 
expressed (or at least not expressed at a substantial level) prior to testing for activity; 
thus, screening for such sequences in a constitutive fashion is less likely to be 
successful because of lethality. In an example presented in Fig. 7, regulatory sequences 
from the ars operon are used to direct individual ORF expression in Enterococcus. The 
ars operon encodes a series of proteins which normally mediate the extrusion of arsenite 
and several other trivalent oxyanions from the cells when they are exposed to such 
toxic substances in their environment. The operon encoding this detoxifying 
mechanism is normally silent and only induced when arsenite-related compounds are 
present. 

Therefore, individual phage ORFs can be expressed in Enterococcus or other 
suitable host in an inducible fashion by adding non-toxic arsenite concentrations to 
the culture medium during the growth of individual Enterococcus (or other host 
cell) clones expressing such individual phage ORFs. Toxicity of the phage killer 
ORF for the host is monitored by reduction or arrest of growth under induction 
conditions, as measured by optical density in liquid culture or after plating the 
induced cultures on solid medium. Subsequently, interference of the phage ORF with 
the host biochemical pathways, ultimately leading to reducing or arresting host 
metabolism, can be measured by pulse chase experiments using radiolabeled 
precursors of either DNA replication, RNA transcription, or protein synthesis. 
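
The optical density comparison described above can be expressed as a simple 
calculation. The following is a sketch under stated assumptions: growth inhibition is 
taken as the fractional reduction in optical density of the induced culture relative 
to the uninduced culture of the same clone, and the 0.5 cutoff used to flag a 
candidate is an arbitrary illustrative value, not one given in the disclosure. 

```python
# Sketch: score growth inhibition from optical density (OD) readings.
# Assumptions: both readings are blank-corrected and taken at the same time point;
# the default threshold is hypothetical.


def growth_inhibition(od_uninduced, od_induced):
    """Return fractional growth inhibition (0.0 = none, 1.0 = complete arrest)."""
    if od_uninduced <= 0:
        raise ValueError("uninduced culture OD must be positive")
    return 1.0 - od_induced / od_uninduced


def is_candidate_killer(od_uninduced, od_induced, threshold=0.5):
    """Flag a clone whose induced culture is inhibited by at least the threshold."""
    return growth_inhibition(od_uninduced, od_induced) >= threshold


if __name__ == "__main__":
    print(growth_inhibition(1.0, 0.25))
    print(is_candidate_killer(0.8, 0.8))
```

A control clone carrying the empty shuttle vector would normally be scored the same 
way, so that any effect of the inducer itself on growth can be separated from the 
effect of the expressed ORF. 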

Of course, other inducible regulatory sequences (e.g., promoters, operators, 
etc.) may be used (e.g., systems using positive induction of expression or systems 
using release of repression). A variety of such systems are known to those skilled in 
the art and can be utilized in the present invention. 






WO 00/32825 



PCT/IB99/02040 




Nucleic acid sequences of the present invention can be isolated using a method 
similar to those described herein or other methods known to those skilled in the art. 
In addition, such nucleic acid sequences can be chemically synthesized by 
well-known methods. Having the phage 182 ORFs, e.g., anti-bacterial ORFs of the present 
invention, portions thereof, or oligonucleotides derived therefrom as described, other 
anti-microbial sequences from other bacteriophage sources can be identified and 
isolated using methods described here or other methods, including methods utilizing 
nucleic acid hybridization and/or computer-based sequence alignment methods. 

The invention also provides bacteriophage anti-microbial DNA segments from 
other phages based on nucleic acids and sequences hybridizing to the presently 
identified inhibitory ORF under high stringency conditions or sequences which are 
highly homologous. The bacteriophage anti-microbial DNA segment from 
bacteriophage 182 can be used to identify a related segment from another unrelated 
phage based on stringent conditions of hybridization or on being a homolog based on 
nucleic acid and/or amino acid sequence comparisons. As with the phage 182 
inhibitory sequences, such homologous coding sequences and products can be used as 
antimicrobials, to construct active portions or derivatives, to construct 
peptidomimetics, and to identify bacterial targets. 
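
One simple measure used in sequence comparisons of this kind is percent identity 
over an alignment. The sketch below assumes the two sequences have already been 
aligned (equal length, with "-" marking gaps) and follows one common convention of 
skipping any column that contains a gap; the function name is hypothetical, and no 
particular identity cutoff for "highly homologous" is implied. 

```python
# Sketch: percent identity between two pre-aligned sequences.
# Assumption: columns containing a gap in either sequence are excluded from
# both the numerator and the denominator (other conventions exist).


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
```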

Enterococcus sequences are listed in Table 27 by accession number, providing 
identification of possible targets of Enterococcus phage inhibitory ORF products, e.g., 
from phage 182. 

Streptococcus pneumoniae 

As indicated in the Summary above, the present invention is concerned 
with the use of Streptococcus sp. bacteriophage Dp-1 coding sequences and the 
encoded polypeptides or RNA transcripts to identify bacterial targets for potential new 
antibacterial agents. 

Streptococcus pneumoniae is an important cause of community-acquired 
pneumonia and a major cause of otitis media, sinusitis, and meningitis in children and 
adults. In Spain and other Mediterranean countries, the majority of S. pneumoniae are 
relatively resistant to penicillin (Klugman, 1990; Fenoll et al., 1991; Jorgensen et al., 
1990). These strains also have decreased susceptibility to broad-spectrum 
cephalosporins, which are frequently used in the empiric treatment of meningitis and 
other serious invasive bacterial infections. High-level resistance of pneumococci has 
been encountered in Hungary, where 70% of children who were colonized with S. 
pneumoniae carried penicillin-resistant strains that were also resistant to tetracycline, 
erythromycin, and trimethoprim/sulfamethoxazole, and 30% were resistant to 
chloramphenicol (Neu, 1992). The resistance of pneumococci to macrolides such as 
erythromycin averages 20-25% in France, ~20% in Japan, and <10% in Spain (Neu, 1992). 

The antimicrobial susceptibilities and distribution of serotypes of 42 
isolates of S. pneumoniae from invasive infections in southern Taiwan have recently 
been determined (Hseuh et al., 1996). Resistance rates among these isolates were: 
erythromycin, 61.9%; clindamycin, 47.6%; chloramphenicol, 19%; and tetracycline, 
73.8%. Resistance to three or more classes of antibiotics was found in 33.3% of the 
isolates. Bacteremic pneumonia and primary bacteremia accounted for 64.3% of the 
infections and mortality was 42.6%. Given the severity of these infections despite 
adequate antibiotic therapy, there is clearly a need for introduction of new therapeutic 
options to prevent mortality due to invasive S. pneumoniae infections. 

Pneumococcal phages belong to four families and they present a great variety 
in morphology, including lytic and temperate phages (for a review, see Garcia et al., 
1997). Examples of lytic phages are Cp-1 and Dp-1, whereas examples of temperate 
phages are HB-3, EJ-1, and HB-746. The complete nucleotide sequence and 
functional organization of Cp-1 has been reported (Martin et al., 1996). Cp-1 has a 
19,345 bp double-stranded DNA genome, with a terminal protein covalently linked to 
its 5' ends, that replicates by a protein-primed mechanism. The phage contains 29 
ORFs, 23 on one strand and 6 on the opposite. When these predicted proteins were 
compared to sequences compiled in the GenBank and EMBL databases, a subset of the 
ORFs showed significant similarity to proteins of bacteriophage phi-29, which infects 
B. subtilis (Martin et al., 1996). The similar proteins corresponded to those involved 
in DNA replication (terminal protein and DNA polymerase), structural and morphogenic 
proteins (major head, collar, connector, tail, and encapsidation proteins), and proteins 
involved in lysis function (holin and lysozyme). In its strategy of lysis, the holin gene 
product inserts itself into the cell membrane, allowing access of the lysozyme to the 
peptidoglycan. Expression of the Cp-1 holin protein in E. coli results in cell death after 
2 hours of induction, but does not lead to lysis (Garcia et al., 1997). Cells harboring a 
plasmid construction with holin and lysozyme genes together did lyse after induction, and 
the viability loss was similar to that of the culture expressing holin alone. Cloning of 
these lytic genes in S. pneumoniae showed that both genes had the same effect as in E. 
coli. That is, holin itself did not lyse the culture but the viability loss was noticeable, 
whereas both holin and lysozyme together were capable of lysing M31, an 
amidase-deleted mutant (Garcia et al., 1997). 

Recently, a small portion (~4 kbp) of a second S. pneumoniae phage, Dp-1, 
has been sequenced (Sheehan et al., 1997). This portion contains the genes coding for 
the lytic system (Sheehan et al., 1997) and shows a modular organization similar to 
that described for Cp-1. However, in this case, a single chimeric protein appears to be 
made in which the N-terminal domain is highly similar to that of the murein hydrolase 
coded by a gene found in the phage BK5-T that infects Lactococcus lactis, and the 
C-terminal domain is homologous to holins. Thus, both functions appear to have been 
combined in a novel chimeric protein. 

Bacteriophage Dp-1 was obtained from Dr. P. Garcia (Departamento de 
Microbiologia Molecular, Centre de Departamento de Investigaciones Biologicas, 
Consejo Superior de Investigaciones Cientificas, Velazquez, Madrid, Spain). We 
found that Dp-1 has a double-stranded DNA genome of 56,506 bp, predicted to 
encode 85 ORFs greater than 33 amino acids and with upstream Shine-Dalgarno 
motifs for translation initiation (Tables 28 & 30, and Fig. 6). The predicted protein 
products of Streptococcus bacteriophage Dp-1 for which computational analysis 
detected homologs in public databases are listed in Table 31, along 
with the accompanying list of related proteins. 

From this analysis, it is apparent that several predicted genes of Dp-1 encode 
polypeptides that are related to structural proteins. ORFs 001, 002, 004, and 030 are 
predicted to encode tail proteins, minor structural proteins, and minor capsid proteins 
(Table 31). We also note the identification of several gene products that are likely 
involved in DNA synthesis. These include ORF 3, which encodes DNA polymerase; 
ORF 8, which encodes a SWI/SNF helicase-related protein; ORF 10, which encodes a 
protein showing homology to recA; and ORF 13, which encodes a dnaZX-like protein. 

In E. coli, RapA encodes an RNA polymerase (RNAP)-associated protein that has 
ATPase activity and is a homolog of the eukaryotic SWI/SNF family, a set of 
proteins whose members are involved in transcription activation, 
nucleosome remodeling, and DNA repair. RapA forms a stable complex with RNAP, 
as if it were a subunit of RNAP, and it is possible that the ORF 8 product behaves 
similarly or in a dominant-negative fashion to inhibit the activity of RapA. Mutation 
of the essential E. coli dnaZX results in a block in DNA chain elongation during 
replication (Maki et al., 1988). The dnaZX gene has only one open reading frame for 
a 71-kDa polypeptide from which the two distinct DNA polymerase III holoenzyme 
subunits, tau (71 kDa) and gamma (47 kDa), are produced. The tau subunit is the 
precursor of the gamma subunit, and the gamma subunit is produced by a -1 
frameshift causing early termination of translation (Tsuchihashi et al., 1990). These 
proteins show single-strand DNA binding that is ATPase (and dATPase) 
dependent and are thought to increase the processivity of the core DNA polymerase 
enzyme (Lee et al., 1987). 

There are several Dp-1 ORFs which encode proteins predicted to play a role in 
cellular metabolic pathways. These include polypeptides involved in coenzyme PQQ 
synthesis (ORFs 20, 29, 38). Pyrrolo-quinoline quinone (PQQ) is the non-covalently 
bound prosthetic group of many quinoproteins catalysing reactions in the periplasm of 
Gram-negative bacteria. Most of these involve the oxidation of alcohols or aldose 
sugars. Interestingly, ORFs 20, 29, and 30 also show homology to the exoenzyme S 
regulon (Frank, 1997). Proteins encoded by the P. aeruginosa exoenzyme S regulon 
may be involved in a contact-mediated translocation mechanism to transfer anti-host 
factors directly into eukaryotic cells, disrupting eukaryotic signal transduction through 
ADP-ribosylation (Frank, 1997). 

There is also a protein with similarity to GTP cyclohydrolase I (ORF 21), and 
ORF 41 shows homology to dUTPase (Table 31). GTP cyclohydrolase I is an 
enzyme that catalyzes the first reaction in the pathway for the biosynthesis of 
pteridine, a cofactor of the monooxygenases of the aromatic amino acids. Disruption 
of the homologous gene in Saccharomyces cerevisiae leads to a recessive conditional 
lethality due to folinic acid auxotrophy that can be complemented with the 
mammalian or bacterial GTP cyclohydrolase I enzymes (Nardese et al., 1996; Mancini 
et al., 1999). 

ORF 16 shows high homology to autolysin. This region of the phage sequence 
was previously reported (Sheehan et al., 1997) and encompasses ~4 kbp of our 
sequence. The sequence published by Sheehan et al. (1997) is shown in Table 32. 

Thus, the present invention provides a nucleic acid sequence obtained from 
Streptococcus bacteriophage Dp-1 comprising at least a portion of a phage Dp-1 ORF, 
preferably an inhibitory ORF, and more preferably at least a portion of one of the 
genes described above with anti-microbial activity. For example, ORF 013 encodes a 
protein with homology to the gamma subunit of DNA polymerase (dnaX gene). This 
protein may act in a dominant-negative fashion to sequester the host DNA polymerase 
for its own replication, thus inhibiting host DNA replication. The dnaX gene product 
is essential for E. coli replication (Kodaira et al., 1983). 

In certain preferred embodiments of the present invention, the bacterial target of 
a bacteriophage inhibitor ORF product, e.g., an inhibitory protein or polypeptide, is 
encoded by a Streptococcus nucleic acid coding sequence from a host bacterium for 
bacteriophage Dp-1. As above, possible target sequences are described herein by 
reference to sequence source sites. The sequence encoding the target preferably 
corresponds to a Streptococcus nucleic acid sequence available from The Institute for 
Genomic Research (TIGR), or available from GenBank or other public database. The 
TIGR Streptococcus sequences are publicly available at The Institute for Genomic 
Research at URL: http://www.tigr.org 

The amino acid sequence of a polypeptide target is readily provided by 
translating the corresponding coding region. For the sake of brevity, the sequences 
are not reproduced herein. Also, in preferred embodiments, a target sequence 
corresponds to a Streptococcus pneumoniae coding sequence corresponding to a 
sequence listed in Table 33 herein. Sequences for other Streptococcal species are also 
available from TIGR and/or from GenBank. The listing in Table 33 describes 
Streptococcus sequences currently deposited in GenBank. Again, for the sake of 
brevity, the sequences are described by reference to the GenBank entries instead of 
being written out in full herein. In cases where the TIGR or GenBank entry for a 
coding region is not complete, the complete sequence can be readily obtained by 
routine methods, e.g., by isolating a clone in a phage Dp-1 host Streptococcus sp. 
genomic library, and sequencing the clone insert to provide the relevant coding 
region. The boundaries of the coding region can be identified by conventional 
sequence analysis and/or by expression in a bacterium in which the endogenous copy 
of the coding region has been inactivated and using subcloning to identify the 
functional start and stop codons for the coding region. 

In the various aspects of this invention involving Dp-1 sequences, the 
sequence is preferably not contained in the sequence described in Sheehan et al., 1997 
(Table 32). 

Validating Identified Inhibitory Phage ORFs 

A fifth step involves validating the identified phage inhibitor ORF by 
independent methods, and delineating further possible smaller segments of the ORFs 
that have inhibitory activity. Several methods exist to validate the role of the 
identified ORF as an inhibitor ORF. 

One example utilizes the creation of a mutant variant of the phage ORF in 
which the candidate ORF carries a partial or complete loss-of-function mutation that 
is measurable as compared with the non-mutant ORF. Comparison of the effects of 
expression of the loss-of-function mutant with the normal ORF provides confirmation 
of the identification of an inhibitor ORF where the loss-of-function mutant provides a 
measurably lower level of inhibition, preferably no inhibition. The loss of function 
may be conditional, e.g., temperature sensitive. 

Once validation of the inhibitor ORF is achieved, a bi-directional deletion 
analysis can be carried out using the same experimental system to identify the 
minimal polypeptide segment that has inhibitor activity. This may be carried out by a 
variety of means, e.g., by exonuclease or PCR methodologies, and is used to 
determine if a relatively small segment of the ORF (i.e., the product of the ORF) still 
possesses inhibitory activity when isolated away from its native sequence. If so, a 
portion of the ORF encoding this "active portion" can be used as a template for the 
synthesis of novel anti-microbial agents, further allowing derivation of the peptide 
sequence, e.g., using modified peptides and/or peptidomimetics. 
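
The logic of a bi-directional deletion analysis can be sketched in Python as follows. This is only a planning aid under stated assumptions: the function names, the step size and the minimum length are hypothetical, and `is_active` stands in for the experimental inhibition readout, which cannot be computed and must be measured for each construct.

```python
from typing import Callable, List, Optional, Tuple


def truncations(length: int, step: int, min_len: int) -> List[Tuple[int, int]]:
    """List (start, end) residue bounds for N- and C-terminal deletions, shortest first."""
    spans = [
        (start, end)
        for start in range(0, length, step)      # N-terminal deletions
        for end in range(length, start, -step)   # C-terminal deletions
        if end - start >= min_len
    ]
    return sorted(spans, key=lambda se: (se[1] - se[0], se[0]))


def minimal_active_segment(
    protein: str,
    is_active: Callable[[str], bool],
    step: int = 5,
    min_len: int = 10,
) -> Optional[Tuple[int, int]]:
    """Return bounds of the shortest tested segment that retains activity, if any."""
    for start, end in truncations(len(protein), step, min_len):
        if is_active(protein[start:end]):
            return start, end
    return None


if __name__ == "__main__":
    # Toy example: activity is assumed to require an intact (invented) motif.
    toy_protein = "A" * 20 + "KKRRKKRRKK" + "A" * 20
    print(minimal_active_segment(toy_protein, lambda seg: "KKRRKKRRKK" in seg))
```

Testing shortest segments first is a convenience of the sketch; at the bench, deletions are usually made progressively from the full-length ORF.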

In creation of certain peptidomimetics, the peptide backbone is transformed 
into a carbon-based hydrophobic structure that can retain inhibitor activity against the 
bacterium. This is done by standard medicinal chemistry methods, typically 
monitored by measuring growth inhibition by the various molecules in liquid cultures 
or on solid medium. These mimetics can also represent lead compounds for the 
development of novel antibiotics. 

Recently, a major effort has been undertaken by the pharmaceutical industry 
and their biotechnology partners for the sequencing of bacterial pathogen genomes. 
The rationale is that the systematic sequencing of the genome will identify all of the 
bacterial proteins and therefore this proteome will be the target for designing novel 
inhibitor antibiotics. Although systematic, this approach has several major problems. 
The first is that analysis of primary amino acid sequences of bacterial proteins does 
not immediately reveal which protein will be essential for viability of the bacterium, 
and target validation is thus a major issue. The second problem is one of redundancy, 
as several biochemical pathways are either structurally duplicated in bacteria 
(different isoforms of the same enzyme), or functionally duplicated by the presence of 
salvage pathways in the event of a metabolic block in one pathway (different 
nutritional conditions). The third is that even a valid target may not be structurally or 
functionally amenable to inhibition by small molecules because of inaccessibility 
(sequestration of target). 

Therefore, there is considerable interest within the pharmaceutical and 
biotechnology industry in identifying key targets for drug discovery amongst the mass 
of novel targets generated by large-scale genomic sequencing projects. 

On the other hand, and underscoring the instant invention, the phages herein 
described have, over millions of years, evolved specific mechanisms to target such 
key biochemical pathways and proteins. In the few cases where inhibition by phages 
has been elucidated (e.g., see ref. 3), such bacterial targets are invariably rate-limiting 
in their respective biochemical pathways, are not redundant, and/or are readily 
accessible for inhibition by the phage (or by another inhibitory compound). 
Therefore, the sixth step of this invention involves identifying the host biochemical 
pathways and proteins that are targeted by the phage inhibitory mechanisms. 

Identifying, Validating, and Characterizing Bacterial Host Target Proteins and 
Affected Pathways 

A rationale for this step is that the inhibitor ORF product from the phage 
physically interacts with and/or modifies certain microbial host components to block 
their function. Exemplary approaches which can be used to identify the host bacterial 
pathways and proteins that interact with, and preferably also are inhibited by, phage 
ORF product(s) are described below. 

One approach is a genetic screen to determine physiological protein:protein 
interaction, for example, using a yeast two-hybrid system. In this assay, the phage 
ORF is fused to the carboxyl terminus of the yeast Gal4 activation domain II (amino 
acids 768-881) to create a bait vector. A cDNA library of cloned S. aureus sequences, 
engineered into a plasmid in which the S. aureus sequences are fused to 
the DNA binding domain of Gal4, is also generated. These plasmids are introduced 
alone, or in combination, into yeast strain Y190, previously engineered with 
chromosomally integrated copies of the E. coli lacZ and the selectable HIS3 genes, 
both under Gal4 regulation (Durfee, T., Becherer, K., Chen, P.-L., Yeh, S.-H., Yang, 
Y., Kilburn, A.E., Lee, W.-H., and Elledge, S.J. (1993). Genes & Dev. 7, 555-569). If 
the two proteins expressed in yeast interact, the resulting complex will activate 
transcription from promoters containing Gal4 binding sites. A lacZ and a HIS3 gene, 
each driven by a promoter containing Gal4 binding sites, have been integrated into the 
genome of the host yeast system used for measuring protein-protein interactions. Such 
a system provides a physiological environment in which to detect potential protein 
interactions. This system has been extensively used to identify novel protein-protein 
interaction partners and to map the sites required for interaction (for example, to 
identify interacting partners of translation factors (Qiu, H., Garcia-Barrio, M.T., and 
Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711), transcription factors 
(Katagiri, T., Saito, H., Shinohara, A., Ogawa, H., Kamada, N., Nakamura, Y., and 
Miki, Y. (1998). Genes, Chromosomes & Cancer 21, 217-222), and proteins involved 
in signal transduction (Endo, T.A., Masuhara, M., Yokouchi, M., Suzuki, R., 
Sakamoto, H., Mitsui, K., Matsumoto, A., Tanimura, S., Ohtsubo, M., Misawa, H., 
Miyazaki, T., Leonor, N., Taniguchi, T., Fujita, T., Kanakura, Y., Komiya, S., and 
Yoshimura, A. Nature 387, 921-924)). This approach has also been used in many 
published reports to identify interaction between mammalian viral and mammalian 
cell proteins. 

For example, the non-structural protein NS1 of parvovirus is essential for viral 
DNA amplification and gene expression and is also the major cytopathic effector of 
these viruses. A yeast two-hybrid screen with NS1 identified a novel cellular protein 
of unknown function that interacts with NS1, called SGT, for small glutamine-rich 
tetratricopeptide repeat (TPR)-containing protein (Cziepluch, C., Kordes, E., Poirey, R., 
Grewenig, A., Rommelaere, J., and Jauniaux, J.C. (1998). J. Virol. 72, 4149-4156). In 
another screen, the adenovirus E3 protein was recently shown to interact with a novel 
tumor necrosis factor alpha-inducible protein and to modulate some of the activities of 
E3 (Li, Y., Kang, J., and Horwitz, M.S. (1998). Mol & Cell Biol 18, 1601-1610). In yet 
another recent screen, the herpes simplex virus 1 alpha regulatory protein ICP0 was 
found to interact with (and stabilize) the cell cycle regulator cyclin D3 (Kawaguchi, Y., 
Van Sant, C., and Roizman, B. (1997). J. Virol. 71, 7328-7336). 

Another two-hybrid system for identifying protein:protein interactions is 
commercially available from STRATAGENE™ as the CYTO-TRAP™ system 
(Chang et al., Strategies Newsletter 11(3), 65-68 (1998) (from Stratagene)). The 
system is a yeast-based method for detecting protein:protein interactions in vivo, using 
activation of the Ras signal transduction cascade by localizing a signal pathway 
component, human Sos (hSos), to its activation site in the yeast plasma membrane. 
The system uses a temperature-sensitive Saccharomyces cerevisiae mutant, strain 
cdc25H, which contains a point mutation at amino acid residue 1328 of the cdc25 
gene. This gene encodes a guanyl nucleotide exchange factor which binds and 
activates Ras, leading to cell growth. The mutation in the cdc25 gene prevents host 
growth at 37°C, but at a permissive temperature of 25°C, growth is normal. The 
system utilizes the ability of hSos to complement the cdc25 defect and activate the 
yeast Ras signaling pathway. Once hSos is expressed and localized to the plasma 
membrane, the cdc25H yeast strain grows at 37°C. Localizing hSos to the plasma 
membrane occurs through a protein:protein interaction. A protein of interest, or bait, 
is expressed as a fusion protein with hSos. The library, or target, proteins are 
expressed with the myristylation membrane-localization signal. The yeast cells are 
then incubated under restrictive conditions (37°C). If the bait and the target protein 
interact, the hSos protein is recruited to the membrane, activating the Ras signaling 
pathway and allowing the cdc25H yeast strain to grow at the restrictive temperature. 

The protein targets of phage inhibitory ORFs can also be identified using 
bacterial genetic screens. One approach involves the overexpression of a phage 
inhibitory protein in mutagenized bacterial host species, followed by plating the cells 
and searching for colonies that can survive the antimicrobial activity of the inhibitory 
ORF. These colonies are then grown, their DNA extracted, and cloned into an 
expression vector that contains a replicon of a different incompatibility group from 
the plasmid expressing the original ORF. This library is then introduced into a 
wild-type host bacterium in conjunction with an expression vector driving synthesis of the 
phage ORF, followed by selection for surviving bacteria. Thus, bacterial DNA 
fragments from the survivors presumably contain a DNA fragment from the original 
mutagenized host bacterial genome that can protect the cell from the antimicrobial 
activity of the inhibitory phage ORF. This fragment can be sequenced and compared 
with that of the bacterial host to determine in which gene the mutation lies. This 
approach enables one to determine the targets and pathways that are affected by the 
killing function. 
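
The final comparison step, determining in which gene a protective mutation lies, can be sketched as follows in Python. This is a simplified illustration that assumes the sequenced fragment and the wild-type reference are already aligned and of equal length (point substitutions only); the function names and the gene coordinates in `annotations` are hypothetical.

```python
from typing import Dict, List, Tuple


def point_differences(wild_type: str, mutant: str) -> List[int]:
    """Return 0-based positions at which two pre-aligned sequences differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(wild_type.upper(), mutant.upper())) if a != b]


def genes_hit(
    positions: List[int],
    annotations: Dict[str, Tuple[int, int]],
) -> Dict[str, List[int]]:
    """Map mutated positions onto annotated genes (0-based, half-open coordinates)."""
    hits: Dict[str, List[int]] = {}
    for name, (start, end) in annotations.items():
        inside = [p for p in positions if start <= p < end]
        if inside:
            hits[name] = inside
    return hits


if __name__ == "__main__":
    # Invented sequences and gene names, for illustration only.
    wt = "ATGAAACCCGGGTTT"
    mut = "ATGAAACCCGAGTTT"
    print(genes_hit(point_differences(wt, mut), {"geneX": (0, 9), "geneY": (9, 15)}))
```

Insertions and deletions would require a proper sequence alignment before this comparison.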

A second approach is based on identifying protein:protein interactions 
between the phage ORF product and bacterial, e.g., S. aureus, proteins using a 
biochemical approach based, for example, on affinity chromatography. This approach 
has been used, for example, to identify interactions between lambda phage proteins 
and proteins from their E. coli host (Sopta, M., Carthew, R.W., and Greenblatt, J. 
(1985). J. Biol. Chem. 260, 10353-10369). The phage ORF is fused to a peptide tag 
(e.g., glutathione-S-transferase ("GST"), 6xHIS ("HIS") and/or calmodulin binding 
protein ("CPB")) within a commercially available plasmid vector that directs high 
level expression on induction of a suitably responsive promoter driving the fusion's 
expression. The translated fusion protein is expressed in E. coli, purified, and 
immobilized on a solid phase matrix via, for example, the tag. Total cell extracts from 
the host bacterium, e.g., S. aureus, are then passed through the affinity matrix 
containing the immobilized phage ORF fusion protein; host proteins retained on the 
column are then eluted under different conditions of ionic strength, pH, detergents, 
etc., and characterized by gel electrophoresis and other techniques. Appropriate 
controls are run to guard against nonspecific binding to the resin. Target proteins thus 
recovered should be enriched for the phage protein/peptide of interest and are 
subsequently electrophoretically or otherwise separated, purified, sequenced, or 
biochemically analyzed. Usually sequencing entails individual digestion of the 
proteins to completion with a protease (e.g., trypsin), followed by molecular mass and 
amino acid composition and sequence determination using, for example, mass 
spectrometry, e.g., by MALDI-TOF technology (Qin, J., Fenyo, D., Zhao, Y., Hall, 
W.W., Chao, D.M., Wilson, C.J., Young, R.A. and Chait, B.T. (1997). Anal. Chem. 
69, 3995-4001). 

The sequences of the individual peptides from a single protein are then 
analyzed by the bioinformatics approach described above to identify the S. aureus 
protein interacting with the phage ORF. This analysis is performed by a computer 
search of the S. aureus genome for an identified sequence. Alternatively, all tryptic 
peptide fragments of the proteins encoded by the S. aureus genome can be predicted 
by computer software, and the molecular mass of such fragments compared to the 
molecular mass of the peptides obtained from each interacting protein eluted from the 
affinity matrix. The responsible gene sequence can be obtained, for example, by using 
synthetic degenerate nucleic acid sequences to pull out the corresponding homologous 
bacterial sequence. Alternatively, antibodies can be generated against the peptide and 
used to isolate nascent peptide/mRNA transcript complexes, from which the mRNA can 
be reverse transcribed, cloned, and further characterized using the procedures 
discussed herein. 
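
The mass-matching alternative described above (predicting tryptic fragments and comparing their masses with those observed) can be sketched in Python as follows. This is a minimal illustration, not the software used by the authors: it assumes complete digestion with no missed cleavages, cleavage after K or R except before P, unmodified residues, neutral monoisotopic masses, and a hypothetical mass tolerance. The function names and the toy proteome are invented for the example.

```python
from typing import Dict, List, Tuple

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(protein: str) -> List[str]:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def rank_candidates(
    observed: List[float], proteome: Dict[str, str], tol: float = 0.01
) -> List[Tuple[str, int]]:
    """Rank proteins by how many observed masses match a predicted tryptic peptide."""
    scores = []
    for name, seq in proteome.items():
        predicted = [peptide_mass(p) for p in tryptic_peptides(seq)]
        hits = sum(any(abs(m - p) <= tol for p in predicted) for m in observed)
        scores.append((name, hits))
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_proteome = {"orfA": "MKAARPGKLL", "orfB": "GGGGKCCCR"}  # invented sequences
    observed = [peptide_mass("MK"), peptide_mass("AARPGK")]
    print(rank_candidates(observed, toy_proteome))
```

A real search would translate every predicted coding region of the host genome, allow for missed cleavages and modifications, and score matches statistically rather than by a simple count.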

A variety of other binding assay methods are known in the art and can be used 
to identify interactions between phage proteins and bacterial proteins or other bacterial 
cell components. Such methods that allow or provide identification of the bacterial 
component can be used in this invention for identifying putative targets. 

Validation of the interaction between the phage ORF product and the bacterial 
proteins or other components can be obtained by a second independent assay (e.g., 
co-immunoprecipitation or protein-protein crosslinking experiments (Qiu, H., 
Garcia-Barrio, M.T., and Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711; 
Brown, S. and Blumenthal, T. (1976). Proc. Natl Acad. Sci. USA 73, 1131-1135)). 

Finally, the essential nature of the identified bacterial proteins is preferably 
determined genetically by creating a constitutive or inducible partial or complete 
loss-of-function mutation in the gene encoding the identified interacting bacterial protein. 
This mutant is then tested for bacterial survival and replication. 

The protein target of the phage inhibitor function can also be identified using a 
genetic approach. Two exemplary approaches will be delineated here. The first 
approach involves the overexpression of a predetermined phage inhibitor protein in 
mutagenized host bacteria, e.g., S. aureus, followed by plating the cells and searching 
for colonies that can survive the inhibitor. These colonies will then be grown, their 
DNA extracted and cloned into an expression vector that contains a replicon of a 
different incompatibility group, and preferably having a different selectable marker, 
than the plasmid expressing the phage inhibitor. Thus, host DNA fragments from the 
mutant that can protect the cell from phage ORF inhibition can be sequenced and 
compared with that of the bacterial host to determine in which gene the mutation lies. 
This approach allows rapid determination of the targets and pathways that are affected 
by the inhibitor. 

Alternatively, the bacterial targets can be determined in the absence of 
selecting for mutations using an approach known as "multicopy suppression". In this 
approach, the DNA from the wild type host is cloned into an expression vector that 
can coexist, as previously described, with one containing a predetermined phage 
inhibitor. Those plasmids that contain host DNA fragments and genes that protect the 
host from the phage inhibitor can then be isolated and sequenced to identify putative 
targets and pathways in the host bacteria. 

Regardless of the specific mode of identification, screening assays may 
additionally utilize gene fusions to specific "reporter genes" to identify a bacterial 
gene(s) whose expression is affected when the host target pathway is affected by the 
phage inhibitor. Such gene fusions can be used to search a number of small molecule 
compounds for inhibitors that may affect this pathway and thus cause cell inhibition. 
This approach will allow the screening of a large number of molecules on petri dishes 
or in 96-well format by monitoring for a simple color change in the bacterial colonies. 
In this manner, we can validate host targets and classes of compounds for further 
study and clinical development. These inhibitors also represent lead compounds for 
the development of other antibiotics. 

Bioinformatics and comparative genomics are preferably then applied to the 
identified bacterial gene products to predict biochemical function. The biochemical 
activity of the protein can be verified in vitro in cell free assays or in vivo in intact 
cells. In vitro biochemical assays utilizing cell-free extracts or purified protein are 
established as a basis for the screening and development of inhibitors. 

These inhibitors, preferably small molecule inhibitors, may comprise peptides, 
antibodies, products from natural sources such as fungal or plant extracts or small 
molecule organic compounds. In general, small molecule organic compounds are 
preferred. These compounds may, for example, be identified within large compound 
libraries, including combinatorial libraries. For example, a plurality of compounds, 
preferably a large number of compounds, can be screened to determine whether any of 
the compounds binds or otherwise disrupts or inhibits the identified bacterial target. 
Compounds identified as having any of these activities can then be evaluated further 
in cell culture and/or animal model systems to determine the pharmacological 
properties of the compound, including the specific anti-microbial ability of the 
compound. 

For mixtures of natural products, including crude preparations, once a 
preparation or fraction of a preparation is shown to have an anti-microbial activity, 
the active substance can be isolated and identified using techniques well known in the 
art, if the compound is not already available in a purified form. 

Identified compounds possessing anti-microbial activity and similar 
compounds having structural similarity can be further evaluated and, if necessary, 
derivatized according to synthesis and/or modification methods available in the art 
selected as appropriate for the particular starting molecule. 

Derivatization of identified anti-microbials 

In cases where the identified anti-microbials above might represent peptidal 
compounds, the in vivo effectiveness of such compounds may be advantageously 
enhanced by chemical modification using the natural polypeptide as a starting point 
and incorporating changes that provide advantages for use, for example, increased 
stability to proteolytic degradation, reduced antigenicity, improved tissue penetration, 
and/or improved delivery characteristics. 

In addition to active modifications and derivative creations, it can also be 
useful to provide inactive modifications or derivatives for use as negative controls or 
for induction of immunologic tolerance. For example, a biologically inactive 
derivative which has essentially the same epitopes as the corresponding natural 
antimicrobial can be used to induce immunological tolerance in a patient being 
treated. The induction of tolerance can then allow uninterrupted treatment with the 
active anti-microbial to continue for a significantly longer period of time. 

Modified anti-microbial polypeptides and derivatives can be produced using a 
number of different types of modifications to the amino acid chain. Many such 
methods are known to those skilled in the art. The changes can include, for example, 
reduction of the size of the molecule, and/or the modification of the amino acid 
sequence of the molecule. In addition, a variety of different chemical modifications of 
the naturally occurring polypeptide can be used, either with or without modifications 
to the amino acid sequence or size of the molecule. Such chemical modifications can, 
for example, include the incorporation of modified or non-natural amino acids or 
non-amino acid moieties during synthesis of the peptide chain, or the post-synthesis 
modification of incorporated chain moieties. 

The oligopeptides of this invention can be synthesized chemically or through 
an appropriate gene expression system. Synthetic peptides can include both naturally 
occurring amino acids and laboratory synthesized, modified amino acids. 

Also provided herein are functional derivatives of anti-microbial proteins or 
polypeptides. By "functional derivative" is meant a "chemical derivative," 
"fragment," "variant," "chimera," or "hybrid" of the polypeptide or protein, which 
terms are defined below. A functional derivative retains at least a portion of the 
function of the protein, for example reactivity with a specific antibody, enzymatic 
activity or binding activity. 

A "chemical derivative" of the complex contains additional chemical moieties 
not normally a part of the protein or peptide. Such moieties may improve the 
molecule's solubility, absorption, biological half-life, and the like. The moieties may 
alternatively decrease the toxicity of the molecule, eliminate or attenuate any 
undesirable side effect of the molecule, and the like. Moieties capable of mediating 
such effects are disclosed in Alfonso and Gennaro (1995). Procedures for coupling 
such moieties to a molecule are well known in the art. Covalent modifications of the 
protein or peptides are included within the scope of this invention. Such 
modifications may be introduced into the molecule by reacting targeted amino acid 
residues of the peptide with an organic derivatizing agent that is capable of reacting 
with selected side chains or terminal residues, as described below. 

Cysteinyl residues most commonly are reacted with alpha-haloacetates (and 
corresponding amines), such as chloroacetic acid or chloroacetamide, to give 
carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues also are 
derivatized by reaction with bromotrifluoroacetone, chloroacetyl phosphate, 
N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide, 
p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol, or 
chloro-7-nitrobenzo-2-oxa-1,3-diazole. 

Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH 
5.5-7.0 because this agent is relatively specific for the histidyl side chain. 
Para-bromophenacyl bromide also is useful; the reaction is preferably performed in 0.1 M 
sodium cacodylate at pH 6.0. 

Lysinyl and amino terminal residues are reacted with succinic or other 
carboxylic acid anhydrides. Derivatization with these agents has the effect of 
reversing the charge of the lysinyl residues. Other suitable reagents for derivatizing 
primary amine-containing residues include imidoesters such as methyl 
picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride; 
trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and 
transaminase-catalyzed reaction with glyoxylate. 

Arginyl residues are modified by reaction with one or several conventional 
reagents, among them phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and 
ninhydrin. Derivatization of arginine residues requires that the reaction be performed 
in alkaline conditions because of the high pKa of the guanidine functional group. 
Furthermore, these reagents may react with the groups of lysine as well as the arginine 
alpha-amino group. 

Tyrosyl residues are well-known targets of modification for introduction of 
spectral labels by reaction with aromatic diazonium compounds or tetranitromethane. 
Most commonly, N-acetylimidazole and tetranitromethane are used to form O-acetyl 
tyrosyl species and 3-nitro derivatives, respectively. 

Carboxyl side groups (aspartyl or glutamyl) are selectively modified by 
reaction with carbodiimides (R'-N=C=N-R') such as 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) 
carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore, 
aspartyl and glutamyl residues are converted to asparaginyl and glutaminyl residues 
by reaction with ammonium ions. 

Glutaminyl and asparaginyl residues are frequently deamidated to the 
corresponding glutamyl and aspartyl residues. Alternatively, these residues are 
deamidated under mildly acidic conditions. Either form of these residues falls within 
the scope of this invention. 

Derivatization with bifunctional agents is useful, for example, for 
cross-linking component peptides to each other or the complex to a water-insoluble support 
matrix or to other macromolecular carriers. Commonly used cross-linking agents 
include, for example, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, 
N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid, 
homobifunctional imidoesters, including disuccinimidyl esters such as 
3,3-dithiobis(succinimidylpropionate), and bifunctional maleimides such as 
bis-N-maleimido-1,8-octane. Derivatizing agents such as 
methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates that 
are capable of forming crosslinks in the presence of light. Alternatively, reactive 
water-insoluble matrices such as cyanogen bromide-activated carbohydrates and the 
reactive substrates 
described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128; 4,247,642; 4,229,537; 
and 4,330,440 are employed for protein immobilization. 

Other modifications include hydroxylation of proline and lysine, 
phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the 
alpha-amino groups of lysine, arginine, and histidine side chains (Creighton, T.E., 
Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, 
pp. 79-86 (1983)), acetylation of the N-terminal amine, and, in some instances, 
amidation of the C-terminal carboxyl groups. 

Such derivatized moieties may improve the stability, solubility, absorption, 
biological half-life, and the like. The moieties may alternatively eliminate or attenuate 
any undesirable side effect of the protein complex. Moieties capable of mediating 
such effects are disclosed, for example, in Alfonso and Gennaro (1995). 

The term "fragment" is used to indicate a polypeptide derived from the amino 
acid sequence of the protein or polypeptide having a length less than the full-length 
polypeptide from which it has been derived. Such a fragment may, for example, be 
produced by proteolytic cleavage of the full-length protein. Preferably, the fragment 
is obtained recombinantly by appropriately modifying the DNA sequence encoding 
the proteins to delete one or more amino acids at one or more sites of the C-terminus, 
N-terminus, and/or within the native sequence. 

Another functional derivative intended to be within the scope of the present 
invention is a "variant" polypeptide that either lacks one or more amino acids or 
contains additional or substituted amino acids relative to the native polypeptide. The 
variant may be derived from a naturally occurring polypeptide by appropriately 
modifying the protein DNA coding sequence to add, remove, and/or to modify codons 
for one or more amino acids at one or more sites of the C-terminus, N-terminus, 
and/or within the native sequence. 

A functional derivative of a protein or polypeptide with deleted, inserted 
and/or substituted amino acid residues may be prepared using standard techniques 
well-known to those of ordinary skill in the art. For example, the modified 
components of the functional derivatives may be produced using site-directed 
mutagenesis techniques (as exemplified by Adelman et al., 1983, DNA 2:183; 
Sambrook et al., 1989) wherein nucleotides in the DNA coding sequence are modified 
such that a modified coding sequence is produced, and thereafter expressing this 
recombinant DNA in a prokaryotic or eukaryotic host cell, using techniques such as 
those described above. Alternatively, components of functional derivatives of 
complexes with amino acid deletions, insertions and/or substitutions may be 
conveniently prepared by direct chemical synthesis, using methods well-known in the 
art. 

Insofar as other anti-microbial inhibitor compounds identified by the invention 
described herein may not be peptidal in nature, other chemical techniques exist to 
allow their suitable modification as well, according to the desirable principles 
discussed above. 

Administration and Pharmaceutical Compositions 

For the therapeutic and prophylactic treatment of infection, the preferred 
method of preparation or administration of anti-microbial compounds will generally 
vary depending on the precise identity and nature of the anti-microbial being 
delivered. Thus, those skilled in the art will understand that administration methods 
known in the art will also be appropriate for the compounds of this invention. 

The particularly desired anti-microbial can be administered to a patient either 
by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or 
excipient(s). In treating an infection, a therapeutically effective amount of an agent or 
agents is administered. A therapeutically effective dose refers to that amount of the 
compound that results in amelioration of one or more symptoms of bacterial infection 
and/or a prolongation of patient survival or patient comfort. 

Toxicity, therapeutic and prophylactic efficacy of anti-microbials can be 
determined by standard pharmaceutical procedures in cell cultures and/or 
experimental organisms such as animals, e.g., for determining the LD50 (the dose 
lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 
50% of the population). The dose ratio between toxic and therapeutic effects is the 
therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that 
exhibit large therapeutic indices are preferred. The data obtained from these cell 
culture assays and animal studies can be used in formulating a range of dosage for use 
in humans. The dosage of such compounds lies preferably within a range of 
circulating concentrations that include the ED50 with little or no toxicity. The dosage 
may vary within this range depending upon the dosage form employed and the route 
of administration utilized. 
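
As a worked illustration of the therapeutic index, the following Python sketch estimates the dose giving a 50% response by linear interpolation on a log-dose scale and then forms the ratio LD50/ED50. The interpolation method, the function names and the dose-response values are assumptions made for the example; standard pharmaceutical procedures typically fit a full dose-response model instead.

```python
import math
from typing import Sequence


def dose_at_half_response(doses: Sequence[float], fraction: Sequence[float]) -> float:
    """Interpolate, on a log10 dose scale, the dose at which 50% of subjects respond."""
    pairs = sorted(zip(doses, fraction))
    for (d0, f0), (d1, f1) in zip(pairs, pairs[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("response data do not bracket 50%")


def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index expressed as the ratio LD50/ED50."""
    return ld50 / ed50


if __name__ == "__main__":
    # Invented data: fraction of subjects showing a therapeutic or lethal response.
    ed50 = dose_at_half_response([1, 10, 100], [0.0, 0.5, 1.0])
    ld50 = dose_at_half_response([10, 100, 1000], [0.0, 0.5, 1.0])
    print(ed50, ld50, therapeutic_index(ld50, ed50))
```

A larger ratio indicates a wider margin between the effective and the lethal dose.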

For any compound identified and used in the method of the invention, the 
therapeutically effective dose can be estimated initially from cell culture assays. Such 
information can be used to more accurately determine useful doses in organisms such 
as plants and animals, preferably mammals, and most preferably humans. Levels in 
plasma may be measured, for example, by HPLC or other means appropriate for 
detection of the particular compound. 

The exact formulation, route of administration and dosage can be chosen by 
the individual physician in view of the patient's condition (see, e.g., Fingl et al., in The
Pharmacological Basis of Therapeutics, 1975, Ch. 1, p. 1).

It should be noted that the attending physician would know how and when to
terminate, interrupt, or adjust administration due to toxicity, organ dysfunction, or
other systemic malady. Conversely, the attending physician would also know to
adjust treatment to higher levels if the clinical response were not adequate (precluding
toxicity). The magnitude of an administered dose in the management of the disorder
of interest will vary with the severity of the condition to be treated and the route of
administration. The severity of the condition may, for example, be evaluated, in part,
by standard prognostic evaluation methods. Further, the dose, and perhaps dose
frequency, will also vary according to the age, body weight, and response of the
individual patient. A program comparable to that discussed above also may be used 
in veterinary or phyto medicine. 

Depending on the specific infection target being treated and the method 
selected, such agents may be formulated and administered systemically or locally, i.e.,
topically. Techniques for formulation and administration may be found in Alfonso
and Gennaro (1995). Suitable routes may include, for example, oral, rectal,
transdermal, vaginal, transmucosal, intestinal, parenteral, intramuscular, 
subcutaneous, or intramedullary injections, as well as intrathecal, intravenous, or 
intraperitoneal injections. 

For injection, the agents of the invention may be formulated in aqueous
solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiological saline buffer. For transmucosal administration, 
penetrants appropriate to the barrier to be permeated are used in the formulation. 
Such penetrants are generally known in the art. 

Use of pharmaceutically acceptable carriers to formulate identified
anti-microbials of the present invention into dosages suitable for systemic administration is
within the scope of the invention. With proper choice of carrier and suitable
manufacturing practice, the compositions of the present invention, in particular those
formulated as solutions, may be administered parenterally, such as by intravenous
injection. Appropriate compounds can be formulated readily using pharmaceutically
acceptable carriers well known in the art into dosages suitable for oral administration. 
Such carriers enable the compounds of the invention to be formulated as tablets, pills, 
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by 
a patient to be treated. 

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents
may be encapsulated into liposomes, then administered as described above.
Liposomes are spherical lipid bilayers with aqueous interiors. All molecules present
in an aqueous solution at the time of liposome formation are incorporated into the
aqueous interior. The liposomal contents are both protected from the external
microenvironment and, because liposomes fuse with cell membranes, are efficiently
delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small
organic molecules may be directly administered intracellularly.

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve the intended purpose. Determination of the effective amounts is well within
the capability of those skilled in the art.

In addition to the active ingredients, these pharmaceutical compositions may
contain suitable pharmaceutically acceptable carriers comprising excipients and
auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. The preparations formulated for oral
administration may be in the form of tablets, dragees, capsules, or solutions, including
those formulated for delayed release or only to be released when the pharmaceutical
reaches the small or large intestine.

The pharmaceutical compositions of the present invention may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating,
entrapping or lyophilizing processes.

Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active anti-microbial compounds in water-soluble form.
Alternatively, suspensions of the active compounds may be prepared as appropriate
oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides,
or liposomes. Aqueous injection suspensions may contain substances which increase
the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents
which increase the solubility of the compounds to allow for the preparation of highly
concentrated solutions.

Pharmaceutical preparations for oral use can be obtained by combining the
active compounds with solid excipient, optionally grinding a resulting mixture, and
processing the mixture of granules, after adding suitable auxiliaries, if desired, to
obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as
sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such
as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum
tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone,
agar, or alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium
dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification
or to characterize different combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit 
capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a 
plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active
ingredients in admixture with filler such as lactose, binders such as starches, and/or
lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft 
capsules, the active compounds may be dissolved or suspended in suitable liquids, 
such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. 

The above methodologies may be employed either actively or prophylactically
against an infection of interest.

Computer-related Aspects and Embodiments 

In addition to the provision of compounds as chemical entities, nucleotide
sequences, or fragments thereof at least 95%, preferably at least 97%, more preferably
at least 99%, and most preferably at least 99.9% identical to phage inhibitor sequences 
can also be provided in a variety of additional media to facilitate various uses. 

Thus, as used in this section, "provided" refers to an article of manufacture,
rather than an actual nucleic acid molecule, which contains a nucleotide sequence of
the present invention; e.g., a nucleotide sequence of an exemplary bacteriophage or a
sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide
sequence at least 95%, more preferably at least 99% and most preferably at least
99.9% identical to such a bacteriophage or bacterial sequence, for example, to a
polynucleotide of an unsequenced phage listed in Table 1, preferably of bacteriophage
77 (S. aureus host) or bacteriophage 3A (S. aureus host) or bacteriophage 96 (S.
aureus host). Such an article provides a large portion of the particular bacteriophage
genome or bacterial gene and parts thereof (e.g., a bacteriophage open reading frame
(ORF)) in a form which allows a skilled artisan to examine and/or analyze the
sequence using means not directly applicable to examining the actual genome or gene
or subset thereof as it exists in nature or in purified form as a chemical entity.

In one application of this aspect, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium that can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such
as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such
as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable media can be used
to create an article of manufacture which includes one or more computer readable
media having recorded thereon a nucleotide sequence or sequences of the present
invention. Likewise, it will be clear to those of skill how additional computer
readable media that may be developed also can be used to create analogous
manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on 
computer readable medium. A skilled artisan can readily adopt any of the presently 
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention. 

A variety of data storage structures are available to a skilled artisan for
creating a computer readable medium having recorded thereon a nucleotide sequence
of the present invention. The choice of the data storage structure will generally be
based on the means chosen to access the stored information. In addition, a variety of
data processor programs and formats can be used to store the nucleotide sequence
information of the present invention on computer readable medium. The sequence
information can, for example, be presented in a word processing text file, formatted in
commercially available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as
DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data processor structuring formats (e.g., text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of the present invention.
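
As one non-limiting illustration of recording sequence information as an ASCII file, the sketch below writes a sequence to a plain text file and reads it back. The FASTA-style layout (a ">" header line followed by wrapped sequence lines), the file name and the function names are assumptions made for this example; the disclosure does not prescribe any particular layout.

```python
import os
import tempfile


def write_fasta(path, name, sequence, width=60):
    """Record a sequence as ASCII text: one '>' header line, then wrapped lines."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        for i in range(0, len(sequence), width):
            handle.write(sequence[i:i + width] + "\n")


def read_fasta(path):
    """Read back a single-record file written by write_fasta."""
    name, chunks = None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
            elif line:
                chunks.append(line)
    return name, "".join(chunks)


def roundtrip(name, sequence):
    """Write a record to a temporary file and return what is read back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sequence.fasta")  # hypothetical file name
        write_fasta(path, name, sequence)
        return read_fasta(path)
```

A database application could hold the same information as a name column and a sequence column; the text file is shown only because it is the simplest of the formats mentioned.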

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form a nucleotide sequence of an unsequenced
bacteriophage, such as an exemplary bacteriophage listed in Table 1 or of a sequence
encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at
least 95%, more preferably at least 99% and most preferably at least 99.9% identical
to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of
bacteriophage 77 (S. aureus host), bacteriophage 3A (S. aureus host), bacteriophage
96 (S. aureus host), bacteriophage 44AHJD (S. aureus host), bacteriophage Dp-1
(Streptococcus pneumoniae host), or bacteriophage 182 (Enterococcus host), the
present invention enables the skilled artisan to routinely access the provided sequence
information for a wide variety of purposes.

Those skilled in the art understand that a variety of different search or
analysis software can implement sequence search and analysis
algorithms, e.g., the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms. For
example, such search algorithms can be implemented on a Sybase system and used to
identify open reading frames (ORFs) within the bacteriophage genome which contain
homology to ORFs or proteins from other viruses, e.g., other bacteriophage, and other
organisms, e.g., the host bacterium. Among the ORFs discussed herein are protein
encoding fragments of the bacteriophage genomes which encode bacteria-inhibiting
proteins or fragments.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described. Such systems are
designed to identify, among other things, useful fragments of the bacteriophage 
genomes. 

As used herein, "a computer-based system" refers to the hardware, software,
and data storage media used to analyze the nucleotide sequence information of the
present invention. The minimum hardware of the computer-based systems of the
present invention comprises a central processing unit (CPU), input device, output
device, and data storage medium or media. A skilled artisan will readily recognize
that any of the currently available general purpose computer-based systems are suitable
for use in the present invention, as well as a variety of different specialized or
dedicated computer-based systems.

As stated above, the computer-based systems of the present invention 
comprise data storage media having stored therein a nucleotide sequence of the 
present invention and the necessary hardware and software for supporting and
implementing a search and/or analysis program.

As used herein, "data storage media" refers to memory which can store 
nucleotide sequence information of the present invention, or a memory access means 
which can access manufactures having recorded thereon the nucleotide sequence 
information of the present invention. 

As used herein, "search program" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of the present genomic
sequences which match a particular target sequence or target motif. A variety of
known algorithms are disclosed publicly and a variety of commercially available
software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not
limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan
can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches and/or sequence analyses can be
adapted for use in the present computer-based systems.

As used herein in connection with sequence searches and analyses, a "target
sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the less likely a target sequence will be present as a random occurrence in
the database. Also, the target sequence length is preferably selected to include
sequence corresponding to a biologically relevant portion of an encoded product, for
example a region which is expected to be conserved across a range of source
organisms. Preferably the sequence length of a target polypeptide sequence is from
5-100 amino acids, more preferably 7-50 or 7-100 amino acids, and still more preferably
10-80 or 10-100 amino acids. Preferably the sequence length of a target
polynucleotide sequence is from 15-300 nucleotide residues, more preferably from
21-240 or 21-300, and still more preferably 30-150 or 30-300 nucleotide residues.
However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may
be of shorter length. Likewise, it may be desirable to search and/or analyze longer
sequences.
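
The relationship between target length and chance matches noted above can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that residues are equally frequent and independent, so that the expected number of exact chance matches is the number of start positions divided by the alphabet size raised to the target length. Real genomes deviate from this model, and the function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected exact chance matches of one target in a random database.

    Simplifying assumption: residues are independent and equally frequent
    (alphabet_size is 4 for nucleotides, 20 for amino acids).
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    positions = max(database_length - target_length + 1, 0)
    return positions / alphabet_size ** target_length


# Illustrative database size only (not the length of any genome disclosed here).
hexamer_hits = expected_random_hits(6, 4101)
pentadecamer_hits = expected_random_hits(15, 4101)
```

Under these assumptions a six-nucleotide target is expected about once by chance in roughly four kilobases, whereas a fifteen-nucleotide target is expected far less than once, which is consistent with the preference for longer targets.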

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the sequence(s) are
chosen based on a three-dimensional configuration which is formed upon the folding
of the target motif. There are a variety of target motifs known in the art. Protein
target motifs include, but are not limited to, enzymatic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

A variety of structural formats for the input and output devices can be used to
input and output the information in the computer-based systems of the present
invention. A preferred format for an output device ranks fragments of the
bacteriophage or bacterial sequences possessing varying degrees of homology to the
target sequence or target motif. Such presentation provides a skilled artisan with a
ranking of sequences which contain various amounts of the target sequence or target
motif and identifies the degree of homology contained in the identified fragment.
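
A ranked output of the kind described can be sketched as follows. This example is a deliberately simple stand-in, not BLAST or any program named herein: it slides the target along a stored sequence without gaps, scores each window by percent identity, and lists the best windows first. The function names and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def rank_fragments(target, sequence, top=3):
    """Return the top windows as (percent identity, 0-based start, fragment)."""
    k = len(target)
    hits = []
    for start in range(len(sequence) - k + 1):
        fragment = sequence[start:start + k]
        hits.append((percent_identity(target, fragment), start, fragment))
    # Highest identity first; ties are broken by position in the sequence.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top]


# Illustrative sequences only.
ranking = rank_fragments("ATGGCT", "TTATGGCTAAATGGATCC", top=2)
```

Each entry of the result carries both the fragment and its score, so that the output identifies the degree of similarity alongside the ranking.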

A variety of comparing methods and/or devices and/or formats can be used to
compare a target sequence or target motif with the sequence stored in data storage
media to identify sequence fragments of the bacteriophage or bacterium in question.
One skilled in the art can readily recognize that any one of the publicly available
homology search programs can be used as the search program for the computer-based
systems of the present invention. Of course, suitable proprietary systems that may be
known to those of skill, or later developed, also may be employed in this regard.

Figure 6 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102 includes a
processor 106 connected to a bus 104. Also connected to the bus 104 are a main
memory 108 (preferably implemented as random access memory, RAM) and a variety
of secondary storage devices 110, such as a hard drive 112 and a removable medium
storage device 114. The removable medium storage device 114 may represent, for
example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A
removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic
tape, etc.) containing control logic and/or data recorded therein may be inserted into
the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable
storage medium 116, once it is inserted into the removable medium storage
device 114.

A nucleotide sequence of the present invention may be stored in a well-known
manner in the main memory 108, any of the secondary storage devices 110, and/or a
removable storage medium 116. During execution, software for accessing and
processing the sequence (such as search tools, comparing tools, etc.) resides in main
memory 108, in accordance with the requirements and operating parameters of the
operating system, the hardware system and the software program or programs.

The data storage medium in which the sequence is embodied and the central
processor need not be part of a single stand-alone computer, but may be separated so
long as data transfer can occur. For example, the processor or processors being
utilized for a search or analysis can be part of one general purpose computer, and the
data storage medium can be part of a second general purpose computer connected to a
network, or the data storage medium can be part of a network server. As another
example, the data storage medium can be part of a computer system or network
accessible over telephone lines or other remote connection method.

EXAMPLES

Example 1. Growth of Staph A bacteriophage 77 and purification of genomic
DNA.

The Staphylococcus aureus propagating strain (PS 77; ATCC #27699) was
used as a host to propagate its respective phage 77 (ATCC #27699-B1). Two rounds
of plaque purification of phage 77 were performed on soft agar essentially as
described in Sambrook et al. (1989). Briefly, the PS 77 strain was grown overnight at
37°C in Nutrient broth [NB: 0.3% Bacto beef extract, 0.5% Bacto peptone (Difco
Laboratories) and 0.5% NaCl (w/v)]. The culture was then diluted 20x in NB and
incubated at 37°C with constant agitation until the OD540 = 0.2 (early log phase). In
order to obtain single plaques, phage 77 was subjected to 10-fold serial dilutions using
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin (w/v)) and
10 µl of each dilution was used to infect 0.5 ml of the cell suspension in the presence
of 400 µg/ml CaCl2. After incubation of 15 min at room temperature (RT), 2 ml of
melted soft agar kept at 45°C (NB supplemented with 0.6% agar) was added to the
mixture and poured onto the surface of 100 mm nutrient agar plates (0.3% Bacto beef
extract, 0.5% Bacto peptone, 0.5% NaCl and 1.5% Bacto agar (w/v)). After overnight
incubation at 30°C, a single plaque was isolated, resuspended in 1 ml of phage buffer
by end over end rotation for 2 hrs at 20°C, and the phage suspension was diluted and
used for a second infection as described above. After overnight incubation at 30°C, a
single plaque was isolated and used as a stock.

The propagation procedure for bacteriophage 77 was modified from the agar
layer method of Swanstrom and Adams (1951). Briefly, the PS 77 strain was grown to
stationary phase overnight at 37°C in Nutrient broth. The culture was then diluted
twenty-fold in NB and incubated at 37°C until the OD540 = 0.2. The suspension (15 x 10^7
bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a ratio of
100 bacteria/phage particle in the presence of 400 µg/ml of CaCl2. After incubation
for 15 min at 20°C, 7.5 ml of melted soft agar (NB plus 0.6% agar) were added to the
mixture and poured onto the surface of 150 mm nutrient agar plates and incubated 16
hrs at 30°C. To collect the phage plate lysate, 20 ml of NB were added to each plate
and the soft agar layer was collected by scraping off with a clean microscope slide
followed by shaking of the agar suspension for 5 min to break up the agar. The
mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) in a JA-10 rotor
(Beckman) and the supernatant fluid (lysate) was collected and subjected to a
treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M of NaCl followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended with 2 ml of phage buffer (1 mM
MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was
extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55
rotor centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 mg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris pH 8.0, 1 mM EDTA).

Example 2. DNA sequencing of Bacteriophage 77 genome 

Four micrograms of phage 77 DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed
(550 Sonic Dismembrator™, Fisher Scientific). Samples were sonicated under an
amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
(pH 8.5).

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C followed by addition of 12.5
units of Klenow large fragment (New England Biolabs) for 15 min at room
temperature. The reaction was stopped by two phenol/chloroform extractions, the
DNA was precipitated with ethanol, and the final DNA pellet was resuspended in 20
µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (Stratagene), which had been dephosphorylated by treatment
with calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation
reaction contained 100 ng of vector DNA, 2 to 5 µl of
repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800
units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C.
Transformation and selection of bacterial clones containing recombinant plasmids was
performed in E. coli DH10β according to standard procedures (Sambrook et al.,
1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKSII+ vector. PCR amplification of foreign
insert was performed in a 15 µl reaction volume containing 10 mM Tris (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and
0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows:
initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 57°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism Big Dye™ primer or ABI prism Big Dye™ terminator cycle sequencing
ready reaction kit (Applied Biosystems). To ensure co-linearity of the sequence data
and the genome, all regions of phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit.

Example 3. Bioinformatic management of primary nucleotide sequence from
Phage 77.

Phage 77 sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism Big Dye™ terminator cycle sequencing ready reaction kit. The complete
sequence of bacteriophage 77 is shown in Table 2.

A software program was developed and used on the assembled sequence of
bacteriophage 77 to identify all putative ORFs larger than 33 codons. Other ORF
identification software can also be utilized, preferably programs which allow
alternative start codons. The software scans the primary nucleotide sequence starting
at nucleotide #1 for an appropriate start codon. Three possible selections can be made
for defining the nature of the start codon: I) selection of ATG, II) selection of ATG or
GTG, and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This
latter initiation codon set corresponds to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code.

When an appropriate start codon is encountered, a counting mechanism is
employed to count the number of codons (groups of three nucleotides) between this
start codon and the next stop codon downstream of it. If a threshold value of 33 is
reached, or exceeded, then the sequence encompassed by these two codons (start and
stop codons) is defined as an ORF. This procedure is repeated, each time starting at
the next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
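
The scanning procedure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the software actually used: it assumes the stop codons TAA, TAG and TGA, counts codons from the start codon up to but not including the stop codon, walks each reading frame codon by codon, and reports 0-based coordinates on the strand scanned. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed stop codon set
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(sequence, start_set="I", min_codons=33):
    """Scan three frames of both strands for ORFs of at least min_codons.

    Returns (strand, frame, start, end, codons) tuples; start and end are
    0-based, end-exclusive, and end includes the stop codon.
    """
    starts = START_SETS[start_set]
    forward = sequence.upper()
    orfs = []
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            pos = frame
            while pos + 3 <= len(seq):
                if seq[pos:pos + 3] not in starts:
                    pos += 3
                    continue
                # Count codons until the next in-frame stop codon.
                end = pos
                while end + 3 <= len(seq) and seq[end:end + 3] not in STOP_CODONS:
                    end += 3
                if end + 3 > len(seq):
                    break  # no in-frame stop codon before the end
                codons = (end - pos) // 3
                if codons >= min_codons:
                    orfs.append((strand, frame, pos, end + 3, codons))
                pos = end + 3  # resume just after the stop codon
    return orfs
```

Selecting "II" or "III" for start_set widens the search to the alternative initiation codons listed above; the threshold of 33 codons is exposed as min_codons so that other cut-offs can be tried.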

Sequence homology (BLAST) searches for each ORF are then carried out
using an implementation of BLAST programs, although any of a variety of different
sequence comparison and matching programs can be utilized as known to those
skilled in the art. Downloaded public databases used for sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) S. aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa);

vi) Streptococcus pyogenes (ftp://ftp.genome.ou.edu/pub/strep/strep-1k.fa);

vii) Streptococcus pneumoniae
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

viii) Mycobacterium tuberculosis CSU#9
(ftp://ftp.tigr.org/pub/data/m_tuberculosis/TB_091097.Z); and

ix) Pseudomonas aeruginosa (http://www.genome.washington.edu/pseudo/data.html).

The results of the homology searches performed on the ORFs are shown in
Table 5.

Example 4. Subcloning of Bacteriophage 77 ORFs into a Staph A inducible
expression system.

The shuttle vector pT0021, in which the firefly luciferase (lucFF) expression
is controlled by the ars (arsenite) promoter/operator (Tauriainen et al., 1997), was
modified in the following fashion. Two oligonucleotides corresponding to a short
antigenic peptide derived from the hemagglutinin protein of influenza virus (HA
epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence
(with BamHI, SalI and HindIII cloning sites) is:

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into pT0021 vector which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite
inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A.
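
As a consistency check on the two HA tag oligonucleotides, the following Python sketch anneals them in silico and reports the single-stranded ends of the resulting duplex. It is an illustration only: the helper names are hypothetical, a 4-nucleotide 5' overhang on each strand is assumed, and the small codon table covers only the codons present in the tag. The observation that GATC and AGCT are the 5' overhangs left by BamHI and HindIII digestion is general background rather than a statement taken from the text.

```python
# Oligonucleotide sequences as given in the text (lower case = cloning sites).
SENSE = "gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA"
ANTISENSE = "agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Partial codon table: only the codons that occur in the HA tag.
_CODONS = {"TAC": "Y", "CCA": "P", "GAC": "D", "GTC": "V",
           "GCC": "A", "AGC": "S", "TGA": "*"}


def reverse_complement(seq):
    """Reverse complement that preserves upper/lower case."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealed_overhangs(sense, antisense, overhang_len=4):
    """Return the 5' overhangs (sense, antisense) of the annealed duplex.

    Raises ValueError if the remaining core is not perfectly complementary.
    """
    core_top = sense[overhang_len:]
    core_bottom = reverse_complement(antisense)[:-overhang_len]
    if core_top.upper() != core_bottom.upper():
        raise ValueError("oligonucleotides do not anneal as assumed")
    return sense[:overhang_len].upper(), antisense[:overhang_len].upper()


def translate_tag(sense):
    """Translate the upper-case (HA tag) portion of the sense strand."""
    tag = "".join(base for base in sense if base.isupper())
    return "".join(_CODONS[tag[i:i + 3]] for i in range(0, len(tag), 3))


print(annealed_overhangs(SENSE, ANTISENSE))  # ('GATC', 'AGCT')
print(translate_tag(SENSE))                  # YPYDVPDYAS*
```

The translated tag contains the nine-residue HA epitope YPYDVPDYA followed by a serine and a stop codon, so a tag fused in frame terminates translation immediately after the epitope.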

Each ORF, encoded by Bacteriophage 77, larger than 33 amino acids and
having a Shine-Dalgarno sequence upstream of the initiation codon was selected for
functional analysis for bacterial inhibition. In total, 98 ORFs were selected and
screened as detailed below. A list of these is presented in Table 3. Each individual
ORF, from initiation codon to last codon (excluding the stop codon), was amplified
from phage genomic DNA using the polymerase chain reaction (PCR). For PCR
amplification of ORFs, each sense strand primer targets the initiation codon and is
preceded by a BamHI restriction site (5'-cgggatcc-3') and each antisense oligonucleotide
targets the penultimate codon (the one before the stop codon) of the ORF and is
preceded by a SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF was
gel purified and digested with BamHI and SalI. The digested PCR product was then
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones were picked and their insert sizes were confirmed by PCR analysis
using primers flanking the cloning site. The names and sequences of the primers that
were used for the PCR amplification were: HAF:
5'-TATTATCCAAAACTTGAACA-3'; HAR: 5'-CGGTGGTATATCCAGTGATT-3'. The
sequence integrity of cloned ORFs was verified directly by DNA sequencing using
primers HAF and HAR. In cases where verification of ORF sequence could not be
achieved by one pass with the sequencing primers, additional internal primers were
selected and used for sequencing.
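
The primer layout described above (restriction-site tail followed by ORF-specific sequence) can be sketched as below. This is an assumed illustration, not the procedure actually used to design the primers: the function name is hypothetical, the length of the ORF-specific annealing region (`anneal_len`) is not stated in the text and is chosen arbitrarily, and the ORF is assumed to be supplied from initiation codon through stop codon.

```python
BAMHI_TAIL = "cgggatcc"    # 5' tail of the sense primer, from the text
SALI_TAIL = "gcgtcgaccg"   # 5' tail of the antisense primer, from the text
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed stop codons

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf, anneal_len=20):
    """Return (sense, antisense) primers for an ORF given start to stop.

    The sense primer begins at the initiation codon; the antisense primer
    ends at the last codon before the stop codon, so the stop codon is
    excluded from the amplified product. anneal_len is a hypothetical
    parameter for the length of the ORF-specific portion.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be in frame and end with a stop codon")
    coding = orf[:-3]  # drop the stop codon
    if len(coding) < anneal_len:
        raise ValueError("ORF shorter than the requested annealing length")
    sense = BAMHI_TAIL + coding[:anneal_len]
    antisense = SALI_TAIL + reverse_complement(coding[-anneal_len:])
    return sense, antisense
```

Lower-case letters mark the restriction-site tails and upper-case letters the ORF-specific portion, mirroring the convention used for the HA tag oligonucleotides.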

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) was used as a
recipient for the expression of recombinant plasmids. Electroporation was performed
essentially as previously described (Schenk and Laddaga, 1992). Selection of
recombinant clones was performed on Luria-Broth agar (LB-agar) plates containing
30 µg/ml of kanamycin.

For each ORF introduced in the pTHA plasmid, 3 independent transformants
were isolated and used to individually inoculate cultures in 5 ml of TSB containing
30 µg/ml kanamycin, followed by growth to saturation (16 hrs at 30°C). An aliquot of
this stationary phase culture was used to generate a frozen glycerol stock of the
transformant (stored at -80°C). The remaining culture was used for plasmid DNA
extraction. Bacterial cells were harvested by centrifugation at 3000 x g at 22°C for 5
min. The pellet was resuspended in 200 µl of 25% sucrose containing 25 U/ml of
lysostaphin and incubated for 15 min at 37°C. Then, 400 µl of alkaline SDS solution
(3% SDS, 0.2 N NaOH) were added, mixed well and incubated for 7 min at room
temperature. After the alkaline SDS treatment, 300 µl of ice-cold 3 M sodium acetate
pH 4.8 were added, and the mix was immediately spun at 13,000 x g for 15 min at room
temperature. The supernatant was transferred to a new 1.5 ml conical centrifuge tube
and 650 µl of isopropanol (stored at room temperature) were added. The mix was then
centrifuged at 13,000 x g for 5 min. The supernatant fluid was discarded, the pellet
washed with 70% ethanol, and resuspended in 320 µl sterile distilled water.

The presence of individual phage 77 ORF DNA inserts in the plasmid was
verified by PCR amplification using 1.5 µl transformant miniprep DNA in a PCR
with primers flanking the cloning site of ORF in pTHA vector (HAF and HAR). The
composition of the PCR reaction and the cycling parameters are identical to those
employed for library screening described above.

Example 5. Functional assay for bacterial inhibitory activity of bacteriophage 77
ORFs.



The anti-microbial activity of individual phage 77 ORFs was monitored by
two growth inhibitory assays, one on solid agar medium, the other in liquid medium.
In general, Staphylococcus bacteria transformed with expression plasmids containing
individual ORFs were grown in normal TSA medium and stored in 19% glycerol. At
pre-determined times, arsenite was added to the culture to induce transcription of the
phage 77 ORFs cloned immediately downstream from an arsenite-inducible promoter
in the pTHA expression plasmid.

The effect of ORF induction on bacterial growth characteristics was then
monitored and quantitated. The growth inhibition assay on solid medium was
performed by streaking pTHA/ORF containing S. aureus transformants onto LB-Kn
and TSA-Kn plates containing increasing concentrations of sodium arsenite (0; 2.5; 5;
and 7.5 µM). Arsenite is used to induce the expression of cloned DNA in pTHA
vector. In parallel, 3 µl of 1/10 and 1/100 dilutions of the frozen cultures of the
pTHA/ORF transformants were spotted as single drops onto LB-Kn and TSA-Kn
plates containing increasing concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM).
The plates were then incubated 16 hrs at 37°C, and the effect of arsenite-induced ORF
expression on bacterial growth was monitored and quantitated by comparing the
extent of growth to that seen in control plates. As positive controls for growth
inhibition, the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner
et al., 1998) were subcloned into the pTHA ars inducible vector and used.

For the growth inhibition assay in liquid medium, stationary phase cultures
were prepared by inoculating 2.5 ml TSB-Kn with frozen S. aureus RN4220
transformants containing phage 77 ORFs cloned in pTHA vector followed by
incubation for 16 hrs at 37°C. These cultures were then diluted 1/100 in the same
medium, and the bacteria were allowed to grow for 2 hrs at 37°C to reach early log
phase. 150 µl of such culture were then mixed with 2.35 ml TSB-Kn medium with or
without arsenite (the final concentration of arsenite in the medium was 0 or 5 µM).
After 3.5 hrs incubation at 37°C with shaking at 250 rpm, 100 µl of
bacterial culture was removed from each tube for OD565 measurement. Serial ten-fold
dilutions of the culture in buffered saline solution (0.85% NaCl) were then spotted
onto TSB-Kn plates. The plates were incubated at 37°C for 16 hrs and the number of
surviving colonies counted the following day. The growth inhibitory property of
individual ORFs was then quantitated by comparing CFU numbers under normal or
arsenite-induction conditions. A schematic flow of the inhibition analysis is shown in
Fig. 3 (also applicable to inhibition analysis for the other phage and bacteria pointed
out herein). Inhibition results are shown in Figures 4A-C.
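
The CFU comparison described above can be illustrated with a short Python sketch. It is an assumed calculation rather than the analysis actually performed: the function names are hypothetical, the spotted volume is not stated in the text and must be supplied by the user, and the colony counts shown are invented placeholders.

```python
def cfu_per_ml(colonies, tenfold_dilutions, spotted_volume_ul):
    """Back-calculate CFU/ml of the undiluted culture from one spot.

    colonies           colonies counted in the spot
    tenfold_dilutions  number of serial ten-fold dilution steps applied
    spotted_volume_ul  volume spotted, in microlitres (an assumed input)
    """
    if spotted_volume_ul <= 0:
        raise ValueError("spotted volume must be positive")
    return colonies * (10 ** tenfold_dilutions) * 1000 / spotted_volume_ul


def fold_reduction(cfu_uninduced, cfu_induced):
    """Ratio of CFU without arsenite to CFU with arsenite."""
    if cfu_induced == 0:
        return float("inf")  # no survivors detected under induction
    return cfu_uninduced / cfu_induced


# Placeholder counts, for illustration only.
uninduced = cfu_per_ml(colonies=42, tenfold_dilutions=4, spotted_volume_ul=10)
induced = cfu_per_ml(colonies=21, tenfold_dilutions=2, spotted_volume_ul=10)
print(fold_reduction(uninduced, induced))
```

A fold reduction close to 1 would indicate no detectable effect of ORF induction, while larger values indicate fewer surviving colonies under arsenite induction.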

Example 6. Identification of Cecropin Signature Motif in Staphylococcus aureus
Bacteriophage 3A ORF

The genome for S. aureus bacteriophage 3A was determined and the sequence
was analyzed essentially as described for bacteriophage 77 in the examples above.
Upon BLAST analysis of the identified open reading frames of phage 3A, the presence of
an amino acid sequence corresponding to a cecropin signature motif was observed.
This motif (WDGHKTLEK) is located at position aa 481-489. Cecropins were
originally identified in proteins from the cecropia moth and are recognized as potent
antibacterial proteins that constitute an important part of the cell-free immunity of
insects. Cecropins are small proteins (31-39 amino acid residues) that are active
against both Gram-positive and Gram-negative bacteria by disrupting the bacterial
membranes. Although the mechanisms by which the cecropins cause cell death are
not fully understood, cell death is generally thought to involve channel formation and
membrane destabilization.

The identification of a motif corresponding to a known inhibitor suggests that
the product of ORF002 is also an inhibitory compound. Such inhibitory activity can
be confirmed as described herein or by other methods known in the art. Confirmation
of the inhibitory activity would indicate that the ORF product could serve as the basis
for construction of mimetic compounds and other inhibitors directed to the target of
the ORF002 product.

Boman & Hultmark, 1987, Ann. Rev. Microbiol. 41:103-126.
Boman, 1991, Cell 65:205-207.
Boman et al., 1991, Eur. J. Biochem. 201:23-31.
Wang et al., J. Biol. Chem. 273:27438-27448.

Example 7. Growth of Staphylococcus aureus bacteriophage 44AHJD.

Staphylococcus aureus propagating strain (PS 44A) (Felix d'Herelle Reference
Centre #HER 1101) was used as a host to propagate its respective phage 44AHJD
(Felix d'Herelle Reference Centre #HER 101). Two rounds of plaque purification of
phage 44AHJD were performed on soft agar essentially as described in Sambrook et
al. (1989). Briefly, the Staphylococcus aureus PS strain was grown overnight at 37°C
in Nutrient Broth [NB: 3 g Bacto Beef Extract, 5 g Bactopeptone per liter (Difco
Laboratories # 0003-17-8), supplemented with 0.5% NaCl]. The culture was then
diluted 20 fold in NB and incubated at 37°C until an OD540 of 0.2. In order to obtain
single plaques, phage 44AHJD was subjected to 10-fold serial dilutions using the
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin) and 10 µl
were used to infect 0.5 ml of the cell suspension in the presence of 400 µg/ml of
CaCl2. After incubation of 15 min at room temperature, 2 ml of melted soft agar (NB
supplemented with 0.6% of agar) were added to the mixture and poured onto the
surface of 100 mm nutrient agar plates (3 g Bacto Beef extract, 5 g Bactopeptone,
0.5% NaCl and 15 g of Bacto agar per liter (Difco Laboratories # 0001-17-0)). After
overnight incubation at 37°C, a single plaque was isolated, resuspended in 1 ml of
phage buffer by end over end rotation for 2 h at room temperature and the phage
suspension was diluted and used for a second infection as described above. After
overnight incubation at 37°C, a single plaque was isolated and used as a stock.

Large scale purification of bacteriophage and preparation of phage DNA were
as follows.

The propagation method was carried out by using the agar layer method
described by Swanström and Adams (1951). Briefly, the PS 44A strain was grown to
stationary phase overnight at 37°C in Nutrient Broth. The culture was then diluted 20x
in NB and incubated at 37°C until the A540 = 0.2. The suspension (15 x 10^7 bacteria)
was then mixed with 15 x 10^5 phage particles to give a ratio of 100 bacteria/phage
particle in the presence of 400 µg/ml of CaCl2. After incubation of 15 min at room
temperature, 7.5 ml of melted soft agar were added to the mixture and poured onto the
surface of 150 mm nutrient agar plates and incubated overnight at 37°C. To collect the
lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by
scraping off with a clean microscope slide and shaken vigorously for 5 min to break
up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g)
using a JA-10 rotor (Beckman) and the supernatant (lysate) was collected and subjected
to a treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To
precipitate the phage particles, 10% (w/v) of PEG 8000 and 0.5 M of NaCl were
added to the lysate and the mixture was incubated on ice for 16 h. The phage was
recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R
table top centrifuge (Beckman).

The pellet was resuspended with 2 ml of phage buffer (1 mM MgSO4, 5 mM
MgCl2, 80 mM NaCl and 0.1% Gelatin). The phage suspension was extracted with 1
volume of chloroform and further purified by centrifugation on a preformed cesium
chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 rotor
and centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).

Example 8. DNA sequencing of the Bacteriophage 44AHJD genome.

Four µg of phage DNA was diluted in 200 µl of TE pH 8.0 in a 1.5 ml
eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher
Scientific). Samples were sonicated under an amplitude of 3 µm with bursts of 5 s
spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then
size fractionated by electrophoresis on 1% agarose gels.
Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified
using a commercial DNA extraction system according to the instructions of the
manufacturer (Qiagen) and eluted in 50 µl of 1 mM Tris-HCl [pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I as
follows. Reactions were performed in a final volume of 100 µl containing DNA, 10
mM Tris-HCl pH 8.0, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 5 µg BSA, 100 µM
of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min
at 12°C followed by addition of 12.5 units of Klenow fragment (New England
Biolabs) for 15 min at room temperature. The reaction was stopped by two
phenol/chloroform extractions and the DNA was ethanol precipitated and resuspended
in 20 µl of H2O.

Cloning of the sonicated phage DNA into pKSII vector and transformation:
Blunt-ended DNA fragments were cloned by ligation directly into the HincII
site of the pKSII vector (Stratagene) dephosphorylated with calf intestinal alkaline
phosphatase (New England Biolabs). A typical reaction contained 100 ng of vector, 2
to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl
containing 800 units of T4 DNA ligase (New England Biolabs) overnight at 16°C.
Transformation and selection of positive clones was performed in the host strain
DH10β of E. coli using ampicillin as a selective antibiotic as described in Sambrook
et al. (1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the HincII cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris-HCl
(pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 mM primer, 187.5 µM each
dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed
by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp
were selected and plasmid DNA was prepared from the selected clones using the
QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was determined
using an ABI 377-36 automated sequencer with two types of chemistry: ABI prism
BigDye™ primer cycle sequencing (21M13 primer: #403055) (M13REV primer:
#403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction kit
(Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the
genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.

Example 9. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Staphylococcus aureus bacteriophage 44AHJD
is shown in Table 16.

A software program was used on the assembled sequence of bacteriophage
44AHJD to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an appropriate start codon is encountered, a counting
mechanism is employed to count the number of codons (groups of three nucleotides)
between this start codon and the next stop codon downstream of it. If a threshold
value of 33 is reached, or exceeded, then the sequence encompassed by these two
codons is defined as an ORF. This procedure is repeated, each time starting at the
next nucleotide following the previous stop codon found, in order to identify all the
other putative ORFs. The scan is performed on all three reading frames of both DNA
strands of the phage sequence. The predicted ORFs for bacteriophage 44AHJD are
listed in Tables 17 & 18.

Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for sequence
analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);
ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);
iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) 5/a/?/i}//0cc>am^ 
30 97.Z); 

vii) PRODOM(ftp://ftp.toulouse.inra.fr/pub/prodon^^ 
astgz); 

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);
ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage
44AHJD are shown in Tables 19 & 20.

Example 10. Sub-Cloning of Bacteriophage 44AHJD ORFs.

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 44AHJD ORF sequence is
inducible. For example, the shuttle vector pT0021, in which the firefly luciferase
(lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et
al., 1997), can be modified in the following fashion. Two oligonucleotides
corresponding to a short antigenic peptide derived from the hemagglutinin protein of
influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense
strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is:
5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3'
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense
strand HA tag sequence (with a HindIII cloning site) is:

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3'
(where upper case letters denote the sequence of the HA tag). The two HA tag
oligonucleotides were annealed and ligated into pT0021 vector which had been
digested with BamHI and HindIII. This manipulation resulted in replacement of the
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite
inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A (another
useful vector construct is shown in Fig. 1B).

Each ORF, encoded by Bacteriophage 44AHJD, larger than 33 amino acids
and having a Shine-Dalgarno sequence upstream of the initiation codon can be
selected for functional analysis for bacterial inhibition. Each individual ORF, from
initiation codon to last codon (excluding the stop codon), can be amplified from phage
genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of
ORFs, each sense strand primer targets the initiation codon and is preceded by a
BamHI restriction site (5'-cgggatcc-3') and each antisense oligonucleotide targets the
penultimate codon (the one before the stop codon) of the ORF and is preceded by a
SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF can be gel
purified and digested with BamHI and SalI. The digested PCR product can then be
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant
pTHA/ORF clones will be picked and their insert sizes confirmed by PCR
analysis using primers flanking the cloning site. The following primers can be used
for PCR amplification: HAF: 5'-TATTATCCAAAACTTGAACA-3'; HAR:
5'-CGGTGGTATATCCAGTGATT-3'. The sequence integrity of cloned ORFs can be
verified directly by DNA sequencing using primers HAF and HAR. In cases where
verification of ORF sequence cannot be achieved by one pass with the sequencing
primers, additional internal primers will be selected and used for sequencing.

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) will be used as
a recipient for the expression of recombinant plasmids. Electroporation will be
performed essentially as previously described (Schenk and Laddaga, 1992). Selection
of recombinant clones will be performed on Luria-Broth agar (LB-agar) plates
containing 30 µg/ml of kanamycin.

Alternatively, a constitutive promoter can be used to drive expression of the
introduced ORF, and cell growth compared to that of control bacterial cells containing
the parental vector lacking any introduced phage ORF. Recombinant plasmids will be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).

Cloning of ORFs with a Shine-Dalgarno sequence

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), can be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site and each antisense strand starts at the last codon (excluding the stop
codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes with sites contained on
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10. Recombinant clones are then picked and their insert sizes confirmed by
PCR analysis using primers flanking the cloning site as well as restriction digestion.
The sequence fidelity of cloned ORFs can be verified by DNA sequencing using the
same primers as used for PCR. In cases where the verification of ORFs cannot be
achieved by one pass of sequencing using primers flanking the cloning site, internal
primers can be selected and used for sequencing. Recombinant plasmids can be
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using
electroporation as previously described (Schenk and Laddaga, 1992).

Induction of gene expression from the ars promoter.

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of S. aureus transformed cells containing phage 44AHJD ORFs onto agar plates
containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 µM). The
plates are incubated overnight at 37°C, after which growth of the ORF
transformants on plates that contain arsenite is compared to that on plates without
arsenite.

2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to the mid log phase (OD 0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage 44AHJD ORFs
on bacterial cell growth is then monitored by measuring the OD540 and comparing the
rate of growth to the culture not containing inducer. [As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993 Virology 193:1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector.] An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.

Example 11. Growth of Enterococcus bacteriophage 182 and purification of
genomic DNA.

The Enterococcus propagating strain (PS) (Enterococcus sp. Group D, Felix
d'Herelle Reference Centre #HER 1080) was used as host to propagate its respective
phage 182 (Felix d'Herelle Reference Centre #HER 80). Two rounds of plaque
purification of phage 182 were performed on soft agar essentially as described in
Sambrook et al. (1989). Briefly, the Enterococcus sp. PS strain was grown overnight
at 37°C in Tryptic Soy Broth [TSB: 17 g Bacto tryptone, 3 g Bacto soytone, 2.5 g
Bacto dextrose, 5 g Sodium chloride, and 2.5 g Dipotassium phosphate per liter
(Difco Laboratories #0370-17-3)]. The culture was then diluted 20 fold in TSB and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 182 was subjected to 10 fold serial dilutions
using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin
(w/v)) and 10 µl of each dilution was used to infect 0.5 ml of the bacterial cell
suspension. After incubation for 15 min at 37°C, 2 ml of melted soft agar (TSB
supplemented with 0.6% agar) was added to the mixture and poured onto the surface
of 100 mm Tryptic Soy Agar plates [TSA: 15 g Tryptone peptone, 5 g Soytone
peptone, 5 g Sodium chloride and 15 g of Agar per liter (Difco Laboratories
#0369-17)]. After overnight incubation at 37°C, a single plaque was isolated, resuspended in
1 ml of phage buffer by end over end rotation for 2 hrs at room temperature, and the
phage suspension was diluted and used for a second infection as described above.
After overnight incubation at 37°C, a single plaque was isolated and used as a stock
for all subsequent manipulations.

The propagation procedure for bacteriophage 182 was modified from the agar
layer method of Swanström and Adams (1951). Briefly, the Enterococcus sp. PS
strain was grown to stationary phase overnight at 37°C in TSB. The culture was then
diluted 20-fold in TSB and incubated at 37°C until the A540 = 0.2. The suspension
(15 × 10^7 bacteria) was then mixed with 15 × 10^5 plaque forming units (pfu) to
give a ratio of 100 bacteria/pfu. After incubation for 15 min at 37°C, 7.5 ml of
melted soft agar (TSB plus 0.6% agar) was added to the mixture, which was poured
onto the surface of 150 mm TSA plates and incubated for 16 hrs at 37°C. To collect
the plate lysate, 20 ml of TSB was added to each plate and the soft agar layer was
collected by scraping it off with a clean microscope slide, followed by vigorous
shaking of the agar suspension for 5 min to break up the agar. The mixture was then
centrifuged for 10 min at 4,000 rpm (2,830 × g) using a JA-10 rotor (Beckman), and
the supernatant fluid (lysate) was collected and treated with 10 µg/ml of DNase I
and RNase A for 30 min at 37°C. To precipitate the phage particles, the phage
suspension was adjusted to 10% (w/v) PEG 8000 and 0.5 M NaCl, followed by
incubation at 4°C for 16 hrs. The phage was recovered by centrifugation at 4,000 rpm
(3,500 × g) for 20 min at 4°C on a GS-6R table top centrifuge (Beckman). The pellet
was resuspended in 2 ml of phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and
0.1% gelatin). The phage suspension was extracted with 1 volume of chloroform and
further purified by centrifugation on a cesium chloride step gradient as described
in Sambrook et al. (1989), using a TLS-55 rotor in an Optima TLX ultracentrifuge
(Beckman) for 2 hrs at 28,000 rpm (67,000 × g) at 4°C. Banded phage was collected
and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at
40,000 rpm (64,000 × g) for 24 hrs at 4°C using a TLV rotor (Beckman). The phage
was harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis
buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage
DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml
Proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive
extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of
chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM
Tris-HCl [pH 8.0], 1 mM EDTA).
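
The bacteria-to-pfu ratio used in the propagation step is simple arithmetic, and the
same arithmetic gives the volume of phage stock to add. The following is a minimal
sketch only: the function names and the stock titer in the example are assumptions
made for illustration and are not values from the procedure.

```python
def bacteria_per_pfu(bacteria: float, pfu: float) -> float:
    """Return the ratio of bacterial cells to plaque forming units."""
    if bacteria <= 0 or pfu <= 0:
        raise ValueError("bacteria and pfu must be positive")
    return bacteria / pfu


def stock_volume_ml(bacteria: float, ratio: float, stock_titer_pfu_per_ml: float) -> float:
    """Volume of phage stock (ml) that delivers bacteria/ratio pfu."""
    if ratio <= 0 or stock_titer_pfu_per_ml <= 0:
        raise ValueError("ratio and stock titer must be positive")
    return (bacteria / ratio) / stock_titer_pfu_per_ml


# Values from the text: 15 x 10^7 bacteria mixed with 15 x 10^5 pfu.
ratio = bacteria_per_pfu(15e7, 15e5)

# Hypothetical stock titer of 1 x 10^8 pfu/ml (an assumption, not from the text).
volume = stock_volume_ml(15e7, ratio, 1e8)
```

With the values from the text, `ratio` evaluates to 100 bacteria per pfu, which
corresponds to a multiplicity of infection of 0.01.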

Example 12. DNA sequencing of the Bacteriophage 182 genome.

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 µm with bursts of 5 sec spaced by 15 sec of cooling in ice/water for
3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on
1% agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
[pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was
resuspended in 20 µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction
contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA (50-100
ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England
Biolabs) and was incubated overnight at 16°C. Transformation and selection of
bacterial clones containing recombinant plasmids were performed in E. coli DH10β
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI Prism BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV
primer: #403056) or the ABI Prism BigDye™ terminator cycle sequencing ready
reaction kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence
data and the genome, all regions of the phage genome were sequenced at least once
from both directions on two separate clones. In areas where this criterion was not
initially met, a sequencing primer was selected and phage DNA was used directly as
sequencing template employing the ABI Prism BigDye™ terminator cycle sequencing
ready reaction kit.
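
The coverage criterion above can be checked mechanically once reads are placed on the
assembly. The sketch below is one possible reading of the criterion, stated as an
assumption: a position passes when it is covered by at least one read in each
direction and the covering reads come from at least two distinct clones. The read
tuple layout and the function name are hypothetical.

```python
from typing import List, Tuple

# (start, end, strand, clone_id); coordinates are 0-based and end is exclusive.
Read = Tuple[int, int, str, str]


def regions_needing_primer_walk(genome_len: int, reads: List[Read]) -> List[Tuple[int, int]]:
    """Return half-open intervals that fail the both-directions, two-clone check."""
    ok = [False] * genome_len
    for pos in range(genome_len):
        strands, clones = set(), set()
        for start, end, strand, clone in reads:
            if start <= pos < end:
                strands.add(strand)
                clones.add(clone)
        ok[pos] = strands == {"+", "-"} and len(clones) >= 2

    # Collapse consecutive failing positions into intervals.
    gaps = []
    gap_start = None
    for pos, good in enumerate(ok):
        if not good and gap_start is None:
            gap_start = pos
        elif good and gap_start is not None:
            gaps.append((gap_start, pos))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, genome_len))
    return gaps


# Toy example: a 10-nt "genome" with one forward and one reverse read.
toy_gaps = regions_needing_primer_walk(10, [(0, 6, "+", "A"), (4, 10, "-", "B")])
```

Each interval returned would be a candidate region for a custom sequencing primer
used directly on phage DNA, as described in the text.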

Example 13. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing the ABI
Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Enterococcus bacteriophage 182 is shown in
Table 21.

A software program was used on the assembled sequence of bacteriophage 182
to identify all putative ORFs larger than 33 codons. The software scans the primary
nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three
possible selections can be made for defining the nature of the start codon: I) selection
of ATG; II) selection of ATG or GTG; and III) selection of any of ATG, GTG, TTG,
CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one
reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c)
for the bacterial genetic code. When an appropriate start codon is encountered, a
counting mechanism is employed to count the number of codons (groups of three
nucleotides) between this start codon and the next stop codon downstream of it. If a
threshold value of 33 is reached or exceeded, then the sequence encompassed by these
two codons is defined as an ORF. This procedure is repeated, each time starting at the
next nucleotide following the previous stop codon found, in order to identify all the
other putative ORFs. The scan is performed on all three reading frames of both DNA
strands of the phage sequence. The predicted ORFs for bacteriophage 182 are listed in
Tables 22 & 23.
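
The scanning procedure described above can be sketched in a few lines. This is an
illustrative reconstruction, not the software actually used. The assumptions are:
the codon count includes the start codon and excludes the stop codon, the stop
codons are TAA, TAG and TGA, and minus-strand coordinates are reported on the
reverse-complemented sequence. All function and variable names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# The three start-codon selections named in the text.
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_frame(seq, frame, starts, min_codons):
    """Scan one reading frame; return (begin, end, codons), end including the stop."""
    orfs = []
    i, n = frame, len(seq)
    while i + 3 <= n:
        if seq[i:i + 3] in starts:
            j = i
            while j + 3 <= n and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > n:
                break  # no stop codon downstream, so no closed ORF
            codons = (j - i) // 3  # start codon counted, stop codon not counted
            if codons >= min_codons:
                orfs.append((i, j + 3, codons))
            i = j + 3  # resume at the nucleotide after the stop codon
        else:
            i += 3
    return orfs


def find_orfs(seq, start_set="III", min_codons=33):
    """Scan all three frames of both strands for putative ORFs."""
    seq = seq.upper()
    starts = START_SETS[start_set]
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for begin, end, codons in scan_frame(s, frame, starts, min_codons):
                results.append((strand, frame, begin, end, codons))
    return results
```

For example, `find_orfs("ATG" + "GCT" * 40 + "TAA", start_set="I")` reports a single
41-codon ORF on the plus strand, while the same construct with only 10 internal
codons falls below the threshold and is not reported.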

Sequence homology searches for each ORF were carried out using an implementation
of BLAST programs. Downloaded public databases used for sequence analysis
include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage
182 are shown in Tables 24 & 26.

Example 14. Sub-Cloning of Bacteriophage 182 ORFs.
Preparation of the shuttle expression vector

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 182 ORF sequence is inducible.
For example, the plasmid pND50 replicates in E. coli, E. faecalis, and S. aureus
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). This plasmid
can be modified by conventional techniques to insert the inducible arsenite promoter,
derived from the shuttle vector pT0021, in which the firefly luciferase (lucFF)
expression is controlled by the ars promoter/operator from a S. aureus plasmid
(Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997). Recombinant luminescent
bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol.
63:4456-4461). This modified shuttle vector will contain the ars promoter, arsR gene
and a cloning site for introduction of individual phage ORFs downstream from a
Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the
arsenite-inducible system. An example is a nisin-inducible system. The nisA promoter
activity is dependent on the proteins NisR and NisK, which constitute a two-component
signal transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and the inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus
agalactiae, Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis.
Significant induction of the nisA promoter (10- to 60-fold induction) can be obtained
in all of these species. A vector containing this promoter was published in Eichenbaum
Z, Federle MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998)
Appl Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, that allow
replication and transcription in Enterococcus can also be utilized.

Alternatively, a constitutive promoter can be used (e.g., the β-lactamase
promoter is constitutive in E. faecalis - see ref. 1) to drive expression of the
introduced ORF, and cell growth compared to that of control bacterial cells containing
the parental vector lacking any introduced phage ORF. Recombinant plasmids are
introduced into E. faecalis strain FA2-2 by electroporation, as previously described
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).

Cloning of ORFs with a Shine-Dalgarno sequence

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding
the stop codon) and is preceded by a different restriction site. The PCR product of
each ORF will be gel purified and digested with the restriction enzymes whose sites
are contained in the PCR oligonucleotides. The digested PCR product is then gel
purified using the Qiagen kit, ligated into the modified shuttle vector, and used to
transform bacterial strain DH10β. Recombinant clones are then picked and their insert
sizes confirmed by PCR analysis using primers flanking the cloning site as well as by
restriction digestion. The sequence fidelity of cloned ORFs will be verified by DNA
sequencing using the same primers as used for PCR. In cases where verification of an
ORF cannot be achieved by one pass of sequencing using primers flanking the cloning
site, internal primers will be selected and used for sequencing. Recombinant plasmids
will be introduced into E. faecalis strain FA2-2 by electroporation, as previously
described (Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H.,
Nakamura, S., and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
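
The primer layout described above (restriction site followed by the ORF end, with the
stop codon left out) can be illustrated as follows. This is a sketch under stated
assumptions: the text does not name the restriction enzymes or the length of the
annealing region, so the BamHI (GGATCC) and SalI (GTCGAC) sites and the default
18-nt annealing length are hypothetical choices, as are the function names and the
example ORF.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_orf_primers(orf, sense_site, antisense_site, anneal_len=18):
    """Return (sense, antisense) primers, written 5' to 3'.

    The sense primer is the restriction site followed by the first anneal_len
    nucleotides of the ORF, beginning at the initiation codon. The antisense
    primer is a different restriction site followed by the reverse complement
    of the last anneal_len nucleotides before the stop codon.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if sense_site == antisense_site:
        raise ValueError("the two primers must carry different restriction sites")
    # Drop the stop codon if the ORF sequence was supplied with it.
    coding = orf[:-3] if orf[-3:] in STOP_CODONS else orf
    if len(coding) < anneal_len:
        raise ValueError("ORF shorter than the annealing region")
    sense = sense_site + coding[:anneal_len]
    antisense = antisense_site + reverse_complement(coding[-anneal_len:])
    return sense, antisense


# Made-up 10-codon ORF, used only to show the layout of the two primers.
example_orf = "ATGGCTAAAGGTCTGGAAACCGTTCAGTAA"
sense, antisense = design_orf_primers(example_orf, "GGATCC", "GTCGAC", anneal_len=9)
```

Requiring two different sites matches the text and fixes the orientation of the insert
in the vector after digestion and ligation.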
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of E. faecalis transformed cells containing a phage 182 ORF onto agar plates containing
different concentrations of sodium arsenite (0, 2.5, 5, and 7.5 µM). The plates are
incubated overnight at 37°C, after which growth of the ORF transformants on plates
that contain arsenite is compared with growth on plates without arsenite to assess
growth inhibition.
2. Quantification of growth inhibition in liquid medium

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid-log phase (OD540 = 0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage 182 ORFs on
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate
of growth to that of the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993 Virology #193:1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters #162:265-274) were
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
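
The colony-count decision rule in the preceding paragraph can be written out
directly. The sketch below is a simplification: it compares raw counts without any
replicate handling or statistical threshold, which the text does not specify. The
function name and the counts in the example are hypothetical.

```python
def classify_orf(colonies_induced: int, colonies_uninduced: int) -> str:
    """Classify an ORF from colony counts of induced and uninduced cultures."""
    if colonies_induced < 0 or colonies_uninduced <= 0:
        raise ValueError("the uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"      # no colonies after growth with inducer
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"    # fewer, but detectable, colonies
    return "no inhibition"


# Hypothetical counts for three ORFs, each against an uninduced count of 200.
calls = [classify_orf(n, 200) for n in (0, 40, 210)]
```

In practice a margin or replicate plates would be needed before a small drop in
colony number is called bacteriostatic; that choice is left open here.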

Example 15. Growth of Streptococcus bacteriophage Dp-1 and purification of
genomic DNA.

The Streptococcus pneumoniae R6 propagating strain (PS) (Tomasz,
1966) was used as host to propagate its respective phage Dp-1 (McDonnell et al.,
1975). (Alternatively, Streptococcus (Diplococcus) pneumoniae R36A could be used.
Strain R36A is available from ATCC as #11733 or 27336. Streptococcus pneumoniae
is also available from the Felix d'Herelle Reference Center in Quebec, Canada as
catalog number HER 1054. Other S. pneumoniae strains are also available from ATCC.)
Two rounds of plaque purification of phage Dp-1 were performed on soft agar
essentially as described in Sambrook et al. (1989). Briefly, the Streptococcus R6 PS
strain was grown overnight at 37°C in K-CAT media [K-CAT: 10 g Bacto casitone, 5 g
Bacto tryptone, 1 g yeast extract, 5 g potassium chloride, 0.2% glucose, 30 mM
potassium phosphate buffer [pH 8] and 250,000 units catalase per liter (Boehringer
Mannheim #10683600)]. The culture was then diluted 20-fold in K-CAT and
incubated at 37°C with constant agitation until the OD540 = 0.2 (early log phase). In
order to obtain single plaques, Dp-1 phage was subjected to 10-fold serial dilutions
using the phage buffer (100 mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM
MgCl2), and 10 µl of each dilution was used to infect 0.5 ml of the cell suspension.
After incubation for 15 min at 37°C, 2 ml of melted soft agar (K-CAT supplemented
with 0.8% agar) was added to the mixture and poured onto the surface of 100 mm
K-CAT agar plates [K-CAT supplemented with 1.2% agar]. After solidification of
the soft agar layer, an additional 5 ml of melted soft agar was added to visualize
distinct plaques (Ronda et al., 1978). After overnight incubation at 37°C, a single
plaque was isolated, resuspended in 1 ml of phage buffer by end-over-end rotation for
2 hrs at room temperature, and the phage suspension was diluted and used for a
second infection as described above. After overnight incubation at 37°C, a single
plaque was isolated and used as a stock for all subsequent manipulations.
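
Plaque counts from the 10-fold dilution series can be converted to a stock titer with
the usual relation: titer = plaques × dilution factor / volume plated. A minimal
sketch follows. The 10 µl (0.01 ml) plated volume is taken from the text, whereas the
plaque count, the dilution step and the function name are hypothetical.

```python
def titer_pfu_per_ml(plaques: int, tenfold_steps: int, volume_plated_ml: float = 0.01) -> float:
    """Estimate the undiluted phage titer (pfu/ml) from one countable plate.

    tenfold_steps is the number of successive 10-fold dilutions applied
    before plating, so the dilution factor is 10 ** tenfold_steps.
    """
    if plaques < 0 or tenfold_steps < 0 or volume_plated_ml <= 0:
        raise ValueError("invalid plaque count, dilution or volume")
    return plaques * (10 ** tenfold_steps) / volume_plated_ml


# Hypothetical plate: 42 plaques from 10 µl of the sixth 10-fold dilution.
titer = titer_pfu_per_ml(42, 6)
```

For the hypothetical plate above the estimate is about 4.2 × 10^9 pfu/ml.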

The propagation procedure for bacteriophage Dp-1 was modified from the
agar layer method of Swanström and Adams (1951). Briefly, the R6 strain of
Streptococcus pneumoniae was grown to stationary phase overnight at 37°C in
K-CAT. The culture was then diluted 20-fold in K-CAT and incubated at 37°C until the
OD540 = 0.2. The suspension (15 × 10^7 bacteria) was then mixed with 15 × 10^5 plaque
forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min at
37°C, 7.5 ml of melted soft agar (K-CAT plus 0.8% agar) was added to the mixture,
which was poured onto the surface of 150 mm K-CAT agar plates and incubated for
16 hrs at 37°C. After solidification of the soft agar layer, 7.5 ml of melted soft agar
was added to each plate. To collect the plate lysate, 20 ml of K-CAT media was added
to each plate and the soft agar layers were collected by scraping them off with a clean
microscope slide, followed by vigorous shaking of the agar suspension for 5 min to
break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm
(2,830 × g) using a JA-10 rotor (Beckman), and the supernatant (lysate) was collected
and treated with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M NaCl, followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500 × g) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended in 2 ml of phage buffer (100
mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM MgCl2). The phage suspension
was extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS-55
rotor in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at 28,000
rpm (67,000 × g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 × g) for 24 hrs at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 hrs at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and
incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).

Example 16. DNA sequencing of the Bacteriophage Dp-1 genome.

Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 µm with bursts of 5 sec spaced by 15 sec of cooling in ice/water for
3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on
1% agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
[pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was
resuspended in 20 µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation
reaction contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA
(50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New
England Biolabs) and was incubated overnight at 16°C. Transformation and selection
of bacterial clones containing recombinant plasmids were performed in E. coli DH10β
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI Prism BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV
primer: #403056) or the ABI Prism BigDye™ terminator cycle sequencing ready
reaction kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence
data and the genome, all regions of the phage genome were sequenced at least once
from both directions on two separate clones. In areas where this criterion was not
initially met, a sequencing primer was selected and phage DNA was used directly as
sequencing template employing the ABI Prism BigDye™ terminator cycle sequencing
ready reaction kit.

Example 17. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge
of the contigs. Phage DNA was used directly as sequencing template employing the ABI
Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Streptococcus bacteriophage Dp-1 is shown in
Table 28.

A software program was used on the assembled sequence of bacteriophage
Dp-1 to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG; II) selection of ATG or GTG; and III) selection of any of ATG,
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the bacterial
genetic code. When an appropriate start codon is encountered, a counting mechanism
is employed to count the number of codons (groups of three nucleotides) between this
start codon and the next stop codon downstream of it. If a threshold value of 33 is
reached or exceeded, then the sequence encompassed by these two codons is defined as
an ORF. This procedure is repeated, each time starting at the next nucleotide following
the previous stop codon found, in order to identify all the other putative ORFs. The
scan is performed on all three reading frames of both DNA strands of the phage
sequence. The predicted ORFs for bacteriophage Dp-1 are listed in Tables 29 and 30,
and Fig. 6.

Sequence homology searches for each ORF were carried out using an
implementation of BLAST programs. Downloaded public databases used for
sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325
(ftp://ftp.genome.ou.edu/pub/staph/staph-lk.fa);

vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/)

The results of the homology searches performed on the ORFs of bacteriophage
Dp-1 are shown in Table 31.

Example 18. Sub-Cloning of Bacteriophage Dp-1 ORFs.
Preparation of the shuttle expression vector

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage Dp-1 ORF sequence is inducible.
For example, the plasmid pLSE4 replicates in E. coli and S. pneumoniae (Diaz and
Garcia, 1990). This plasmid can be modified by conventional techniques to insert the
inducible arsenite promoter, derived from the shuttle vector pT0021, in which the
firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a
S. aureus plasmid (Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997).
Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite.
Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain
the ars promoter, arsR gene and a cloning site for introduction of individual phage
ORFs downstream from a Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the
arsenite-inducible system. An example is a nisin-inducible system. The nisA promoter
activity is dependent on the proteins NisR and NisK, which constitute a two-component
signal transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and the inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus
agalactiae, Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis.
Significant induction of the nisA promoter (10- to 60-fold induction) can be obtained
in all of these species. A vector containing this promoter was published in Eichenbaum
Z, Federle MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998)
Appl Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, that allow
replication and transcription in Streptococcus can also be utilized.

Alternatively, a constitutive promoter can be used to drive expression
of the introduced ORF, and cell growth compared to that of control bacterial cells
containing the parental vector lacking any introduced phage ORF. Recombinant plasmids
are introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).

Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding
the stop codon) and is preceded by a different restriction site. The PCR product of
each ORF will be gel purified and digested with the restriction enzymes whose sites
are contained in the PCR oligonucleotides. The digested PCR product is then gel
purified using the Qiagen kit, ligated into the modified shuttle vector, and used to
transform bacterial strain DH10β. Recombinant clones are then picked and their insert
sizes confirmed by PCR analysis using primers flanking the cloning site as well as by
restriction digestion. The sequence fidelity of cloned ORFs will be verified by DNA
sequencing using the same primers as used for PCR. In cases where verification of an
ORF cannot be achieved by one pass of sequencing using primers flanking the cloning
site, internal primers will be selected and used for sequencing. Recombinant plasmids
will be introduced into S. pneumoniae R6 as previously described (Diaz and Garcia,
1990).

Induction of gene expression from the ars promoter.

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an
aliquot of S. pneumoniae transformed cells containing phage Dp-1 ORFs onto agar
plates containing different concentrations of sodium arsenite (0, 2.5, 5, and 7.5 µM).
The plates are incubated overnight at 37°C, after which growth of the ORF
transformants on plates that contain arsenite is compared with growth on plates
without arsenite to assess growth inhibition.

2. Quantification of growth inhibition in li quid medium 

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to mid-log phase (OD540 = 0.2) with fresh medium containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the cultures are incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage Dp-1 ORFs on
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate
of growth to that of the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, G.R., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193:1033-1036) and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, M.J., Gaeng, S., Wendlinger, G.,
Maier, S.K. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer, in contrast to when grown in the absence of inducer.
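The two readouts described above (OD540 growth relative to the uninduced culture, and colony counts after plating) can be sketched in code. The following is a minimal illustration only, not part of the disclosed method: the record type `OrfResult`, the function names, and the example numbers are all hypothetical, and a real analysis would need replicates and a statistically justified threshold rather than a simple comparison.

```python
from dataclasses import dataclass


@dataclass
class OrfResult:
    """Hypothetical record of the readouts for one ORF clone."""
    name: str
    od_start: float          # OD540 when inducer is added (e.g. 0.2)
    od_induced: float        # OD540 after incubation with inducer
    od_uninduced: float      # OD540 after incubation without inducer
    colonies_induced: int    # colonies recovered from the induced culture
    colonies_uninduced: int  # colonies recovered from the uninduced culture


def relative_growth(r: OrfResult) -> float:
    """OD540 increase with inducer as a fraction of the increase without it."""
    baseline = r.od_uninduced - r.od_start
    if baseline <= 0:
        raise ValueError("uninduced culture did not grow; assay is uninformative")
    return (r.od_induced - r.od_start) / baseline


def classify(r: OrfResult) -> str:
    """Apply the colony-count criteria stated in the text."""
    if r.colonies_uninduced <= 0:
        raise ValueError("no colonies from the uninduced control")
    if r.colonies_induced == 0:
        return "bactericidal"      # no colonies after induction
    if r.colonies_induced < r.colonies_uninduced:
        return "bacteriostatic"    # fewer, but detectable, colonies
    return "no inhibition"


# Invented example values, for illustration only.
example = OrfResult("ORF-x", od_start=0.2, od_induced=0.25, od_uninduced=0.6,
                    colonies_induced=40, colonies_uninduced=400)
print(example.name, round(relative_growth(example), 3), classify(example))
```

The strict "fewer colonies" comparison mirrors the wording of the text; in practice a minimum fold reduction would likely be chosen so that plating noise is not scored as inhibition.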
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All patents and publications mentioned in the specification are indicative of 

the levels of skill of those skilled in the art to which the invention pertains. All
references cited in this disclosure are incorporated by reference to the same extent as
if each reference had been incorporated by reference in its entirety individually.

One skilled in the art would readily appreciate that the present invention is
well adapted to carry out the objects and obtain the ends and advantages mentioned,
as well as those inherent therein. The specific methods and compositions described
herein as presently representative of preferred embodiments are exemplary and are not
intended as limitations on the scope of the invention. Changes therein and other uses
will occur to those skilled in the art, which are encompassed within the spirit of the
invention as defined by the scope of the claims.

It will be readily apparent to one skilled in the art that varying substitutions
and modifications may be made to the invention disclosed herein without departing
from the scope and spirit of the invention. For example, those skilled in the art will
recognize that the invention may suitably be practiced using a variety of different
bacteria, bacteriophage, and sequencing methods within the general descriptions
provided.

The invention illustratively described herein suitably may be practiced in the
absence of any element or elements, limitation or limitations which is not specifically
disclosed herein. Thus, for example, in each instance herein any of the terms
"comprising", "consisting essentially of" and "consisting of" may be replaced with
either of the other two terms. The terms and expressions which have been employed
are used as terms of description and not of limitation, and there is no intention, in
the use of such terms and expressions, of excluding any equivalents of the features
shown and described or portions thereof, but it is recognized that various
modifications are possible within the scope of the invention claimed. Thus, it should
be understood that although the present invention has been specifically disclosed by
preferred embodiments and optional features, modification and variation of the
concepts herein disclosed may be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of this invention as
defined by the appended claims.
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In addition, where features or aspects of the invention are described in terms of 
Markush groups or other grouping of alternatives, those skilled in the art will 
recognize that the invention is also thereby described in terms of any individual 
member or subgroup of members of the Markush group or other group. For example, 
if there are alternatives A, B, and C, all of the following possibilities are included: A
separately, B separately, C separately, A and B, A and C, B and C, and A and B and
C. Thus, for example, for the bacteria and phage specified herein, the embodiments
expressly include any subset or subgroup of those bacteria and/or phage. While each
such subset or subgroup could be listed separately, for the sake of brevity, such a
listing is replaced by the present description.
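The enumeration of alternatives A, B, and C given above can be reproduced mechanically. The short sketch below is illustrative only; the helper name `nonempty_subsets` is hypothetical, and the code simply lists every non-empty subset of a group, of which there are 2^n - 1 for n members.

```python
from itertools import combinations


def nonempty_subsets(members):
    """Return every non-empty subset of members, smallest subsets first."""
    items = list(members)
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


subsets = nonempty_subsets(["A", "B", "C"])
for subset in subsets:
    print(" and ".join(subset))
```

For the three alternatives in the text this yields seven combinations, matching the seven possibilities listed in the paragraph above.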

Thus, additional embodiments are within the scope of the invention and within 
the following claims. 
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Table 1 



Phages against human and animal pathogenic bacteria




Pathogen name


Phage name


Catalog


Origin/reference


Acinetobacter 
calcoaceticus 


A3/2 

A10/45 

A36 

B9GP 

B9PP 

BS46 

E13 

E14 

531 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Ap3 
P78 




J. Bacteriol 1984. 157: 179-183 

J. Gen. Microbiol 1986.132: 2633-2636 


Acinetobacter 
haemolyticus 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Acinetobacter
johnsonii






Felix d'Herelle Reference 

Centre,Quebec,Quebec


Acinetobacter sp. 


BP1 




J.Virol.1968.2:716-722


G4, HP2, HP3 & 
HP4 




Can.J.Microbiol. 1966. 12: 1023-1030 & 
J. Virol. 1974. 13:46-52 & 
Arch. Virol. 1994. 135:345-354


A1,A4, A9& 
196 




Arch.Virol.1994.135:345-354


HP1 




Can.J.Microbiol. 1966. 12: 1023-1030 


A 19, A23,A29, 
A31,A33,A34, 
A3759&2845 




J.Microsc (Paris) 1973.16:215-224 & 
CR.Hebdo Seances Acad.Sci.Ser D.Sci 
Natur(Paris)278:1907-1909 & 
Arch.Virol.1994.135:345-354 &
Rev.Can.Biol.1970.29:317-320


Actinobacillus
actinomycetemcomitans






FEMS Microbiol Lett 1994. 119:329-337
Infect. Immun. 1982. 35: 343-349








Mol.Gen.Genet. 1998. 258: 323-325




Aaφ247




Oral Microbiol. Immunol. 1997. 12: 40-46


Actinomyces viscosus 




43146-B1 


The American Type Culture Collection 




- 




Infect.Immun.1985.48:228-233








Infect.Immun.1988.56:54-59








Plasmid 1997.37:141-153 


Aeromonas hydrophila 


PM2** &PM3 




FEMS Microbiol.Lett. 1990.57:277-282




Aeh1

Aeh2

PM4 

PM5 

PM6 

T7-ah 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 
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Aeromonas 
salmonicida 


3 

25 
29 
31 
32 

40RR2.8t

43 

51 

56 

59.1 

65 

Asp37 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 




55R.I 




Can. J. Microbiol. 1983. 29: 1458-1461 


Alteromonas espejiana 


PM2**


27025-B1 


The American Type Culture Collection 


Asticcacaulis
biprosthecum 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Asticcacaulis 
excentricus


φAc21
φAc24


15261-B1
15261-B2 
15261-B3 


The American Type Culture Collection 


Azotobacter vinelandii


A91 

A31 
A41 

PAVl 
rAV 1 


1ZJ I O-XJ 1 

12518-B4 

12518-B5 

12518-B9 

12518-B10

13705-B1


The American Type Culture Collection


Azotobacter sp. 






Virology 1972. 49: 439-452


Bacteroides fragilis 


Bf-1 




Rev. Infect. Dis. 1979. 1: 325-336


B40-8 




FEMS Microbiol. Lett 1991. 66: 61-67 




HSP40 




Appl. Environ. Microbiol. 1989. 55: 2696- 
2701 




phiAl 




Zentralbl.bakteriol. 1972.222:57-63 


Bdellovibrio 
bacteriovorus 


MAC-1 




J. Gen. Microbiol. 1987. 133: 3065-3070 


Bdellovibrio sp. 


VL-1 




J.Virol.1973.12:1522-1533


Bordetella
bronchiseptica


214 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1987.5:9-13
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Bordetella 
parapertussis 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


/ 






Mol. Gen. Mikrobiol. Virusol. 1988.4: 22-25 








Zh.Mikrobiol.Epidemiol.Immunobiol. 1987.5:9-13




41405 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1987.5:9-13


Brucella abortus 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 






23448-B1
23448-B2
23448-B3
17385-B1
17385-B2


The American Type Culture Collection 




10/1 

24/D 

212/XV 








BK-2, TB & 
Fi** 




Zh.Mikrobiol.Epidemiol.Immunobiol.1983.2:48-52




R/c&R/O 




Dev. Biol. Stand. 1984.56: 55-62


Brucella canis 


R/c 




Dev. Biol. Stand. 1984.56: 55-62 


Brucella melitensis 


BK-2 


23456-B1 


The American Type Culture Collection 


Brucella suis 


Wb 




Zentralbl.Veterinarmed. 1975.22:866-867
Fi** & TB




Zh.Mikrobiol.Epidemiol.Immunobiol.1983.2:








48-52 


Brucella sp. 






Can. J. Vet Res. 1989.53: 319-325 








Res. Vet. Sci. 1988. 44: 45-49 




it 




Zh.Mikrobiol.Epidemiol.Immunobiol.1983.2:








48 


Campylobacter coli 




411H..R1 


The American Type Culture Collection








Campylobacter coli 


18 




The American Type Culture Collection


\\AJ711 UJ 


19 


HJ 1 JO"D 1 






20 






Campylobacter jejuni


i 
i 


35918-B1 


The American Type Culture Collection 




35919-B1 








35920-B1 






4 


35921-B1 






J 


35918-B2 






u 


35920-B2 






7 


35922-B2 






g 


35923-B1 






0 


35924-B1 






10 


35925-B1 






11 


35925-B2 






12 


35922-B2 






13 


35924-B2 






14 

i*t 


35922-B3 






17 


43133-B1 






18 


43134-B1 






19 


43135-B1 






20 


43136-B1 




Campylobacter 


HP1 




J. Med. Microbiol.1993. 38: 245-249 


(Helicobacter) pylori 








Chlamydia psittaci 


Chp1**




J. Gen. Virol. 1989. 70: 3381-3390 


Clostridium 


CAK-1 




J.Bacteriol.1993.175:3838-3843


acetobutylicum 
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Clostridium botulinum 






Nucleic Acids Res. 1990. 18: 1291 








Biochem.Biophys.Res.Commun. 1990.171:1304-1311








Microbiol.Immunol. 1981.25:915-927








J.Vet.Med.Sci.1992.54:675-684




CEβ & CEγ






Clostridium difficile 


41 &56 




J. Clin. Microbiol. 1985.21:251-254
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Clostridium 






Rev.Can.Biol.1977.36:205-215


perfringens 














FEMS Microbiol.Lett. 1990.54:323-326


Clostridium 




8074-B1 


The American Type Culture Collection 


sporogenes 


59 


17886-B1 






70 


17886-B3 






71 


17886-B4 






72S 


17886-B5 






72L 


17886-B6 




Clostridium tetani 


A&B 




Rev.Can.Biol.1978.37:43-46


Corynebacterium 






Vopr.Virusol.1986.31:577-584


diphtheriae








Corynebacterium 


NN 


12319-B1 


The American Type Culture Collection 


pseudotuberculosis 








Corynebacterium sp 


DLC 2921/49 


12052-B1 


The American Type Culture Collection 
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Enterococcus faecalis 


42 


19948-B1 


The American Type Culture Collection 


Enterococcus faecium 


124 
133 


19950-B1 
19953-B2
19953-B1 


The American Type Culture Collection 
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Escherichia coli 




11303-B14 

11303-B10 

11303-B21

8677-B1 

11303-B13 

13706-B4 


The American Type Culture Collection 


Escherichia coli 




15766-B1


The American Type Culture Collection 


(Cont'd) 




15766-B1
1242-B5
15669-B2

15767-B1
11303-B16
27-65-B1






C204 

El 

f1**

f2** 

FCZ 

fd** 


25065-B2 
15669-B1 






15597-B1 






21816-B1 
23724-B9 
15593-B1 






25404-B1 

29746-B1 

23631-B1

25868-B1 

25298-B1 

25298-B2 

11303-B37 

11303-B24 


• 




If1**


11303-B26 






11303-B27 
11303-B28 
11303-B29 
11303-B30 
11303-B33 
11303-B31 
11303-B25 
11303-B35 






MS2** 

MU9 

Mu-1 

Ox6 

P1**

P4 sid1 **

Qβ**

R17** 

21K/1 

ZJ/2 


11303-B34 
11303-B36 
11303-B32 






13706-B5 
11303-B1 






11303-B2 

11303-B3 

11303-B4 

35060-B1 

35060-B2 

35060-B3 

11303-B5 

11303-B6 

11303-B7 

11303-B38 

12141-B1 
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Escherichia coli 
(Cont'd) 



547 

UV1

UV47 

UV375 

α3**

λ **

λC-17

λ sus P-3

λ sus R-5

λ sus J-6

λ sus O-8

λ sus A-11

Xind* 

<*92 

fit 

$£cs70am-3 


11303-B20 

11303-B17 

11303-B15 

11303-B11 

11303-B18 

13706-B2 

23724-B2 

23724-B1

23724-B3 

23724-B4 

23724-B5 

23724-B6 

23724-B7 

23724-B8 

35860-B1 

13706-B3 

15597-B2 

13706-B1

49696-B1


The American Type Culture Collection 


G4** & <*&** 




Biochim.Biophysica Acta. 1992.1130:277-288


BF23** 




J.Bacteriol.1977.129:265-275


Mul 

IVlUi 




J.Ultrastruct.Res.1966.14:441-448


Hpl7 




J.Mol.Biol.1991.218:705-721








T)L1 0#6 DLf 1 O. 

Rblo , Kwl & 
Rb69** 




T Rj>rt#»rin1 1000 17? '180-186 


HI** H3 H8 
K9, 

K18 & Oxl 




Mol.Gen.Genet.1990.221:491-494


Ml**, Tula** & 
Tulb** 




J.Mol.Biol.1987.196:165-174


K10 




J.Bacteriol.1979.140:680-686


Qsr* 




J.Bacteriol.1985.162:256-262


B278 




J.Gen.Microbiol.1988.134:1333-1338


phi 80** 




FEMS Microbiol.Lett.1994. 119:71-76 


phi ml73 




Genetika 1985.21:673-675 


tf-1 




J.Gen.Microbiol.1987.133:953-960


P4 & phiR73 




Mol.Microbiol.1995.18:201-208


l r 2 




J.Gen.Microbiol.1982.128:2797-2804


PRD1 




Virology 1990.177:445-451 


K3hx 




Mol.Gen.Genet.1987.206:110-115


933J**& 
933W** 




iniect.irnniunity.iyoo.jj.1 jjM«tu ... v. 


H19-B** 




J.Bacteriol.1987.169:4308-4312


Tcp-111 




Zentralbl.Bakteriol.Mikrobiol.Hyg.1988.270:41-51
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Escherichia coli 
(Cont'd) 



N4** 




Vet.Microbiol.1992.30:203-212


Phi 80 tip 




Ann.Inst.Pasteur.1971.120:121-125


Obeta 1 




J.Bacteriol.1978.133: 172-177 


P1CM 




J.Gen.Microbiol.1978.107:73-83


PA-2** 




J.Bacteriol.1990.172:1660-1662


186**




Mol.Gen.Genet.1982.187:87-95


186.IX.B 




Mol.Microbiol. 1992.6:2629-2642


21** 




Virology 1983.129:484-489


P4** 




Microbiol.Rev. 1993.57:683-702


82** 




J.Biol.Chem. 1987.262:11721-11725


PSP3 




J.Bacteriol.1996.178:5668-5675


HK022** 




Nucleic Acids Res. 1994.22:354-356 


D108** 




Nucleic Acids Res. 1986.14:3813-3825


Rb49 




J.Mol.Biol.1997.267:237-249


Ike** 




J.Mol.Biol.1985.181:27-39


P22dis 




Mol.Gen.Genet.1978.166:233-243


N15** 




J.Bacteriol.1996.178:1484-1486


If1**




Proc.R.Soc.Lond.B.Biol.Sci.1991.245:23-30


Stx2Phi-I & 




Infect.Immun.1998.66:4100-4107


Stx2Phi-II 






18 




Virology 1987.156:122-126 


X 




J.Gen.Microbiol.1981.126:389-396


AC3 




Mol.Microbiol.1991.5:715-725
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BW-1 




Felix d'Herelle Reference




C-l 




Centre,Quebec,Quebec




E920g 








Esc-7-11 








H19J 








Haiti 








HK243 








la 








K20 








K30 








KLj 








M 








Mu** 








O103 








0157:H7 








P1D 








ptl 








PilHo 








PR64FS 








PR772 








SS4 








P4Q 








Jlvir*» 








CM 








09-1 








92 






Haemophilus 


HP1** 




Nucleic Acids Res. 1996.24:2360-2368 


influenzae 


S2** 




Gene 1997. 196: 139-144 


Halobacterium 


S45 




Felix d'Herelle Reference 


cutirubrum 






Centre,Quebec,Quebec


Halobacterium 






Felix d'Herelle Reference 


halobium 






Centre,Quebec,Quebec 








Can.J.Microbiol.1982.28:916-921


Halobacterium 






Biol.Chem.Hoppe Seyler 1994.375:747-757


salinarium 
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Klebsiella oxytoca 


tf-1 




J.Gen.Microbiol.1987.133:953-960


Klebsiella pneumoniae 


60 
92 


23356-B1

23357-B1


The American Type Culture Collection 




K19Q 




Felix d'Herelle Reference 
Centre,Quebec,Quebec




FC3-1 &FC3-9 




Can.J.Microbiol.1991.37:270-275




FC3-10 




FEMS Microbiol.Lett. 1991.67:291-297


Klebsiella sp. 


K11**




Mol.Gen.Genet. 1990.221:283-286 




LE1, LE3 & LE4




Res.Microbiol.1990.141:1131-1138


Listeria 


243 


23074-B1 


The American Type Culture Collection 


monocytogenes 


i i\n 1111 Pr 

197,1313 cl 
9425 




Appl.Environ.Microbiol.1997.63:3374-3377




H387 & H387-A 




Appl.Environ.Microbiol.1993.59:2914-2917




5775,6223 
&12682 




APMIS 1993.101:160-167 




2389, 2671, 
4211 & 2685 




Intervirology 1994.37:31-35 & 

7^«+t-nlK1 Rakprinl Mlkrohifll HvC 1986 26 Tl 

Z^enn^iDi.isainenoi.iYiiiuuuiui.iijfg.i-'uvi.A.v* . » 




4b. 4ab. 4g & 3c 




Ann.Microoioi (ransj iy / / .iio.loj 




A118, A500& 

A CI IS* 




MoLMicrobiol. 1995.16:1231-1241-992 




1,3,4,5,6,7, 8, 
o in ii 14 IS 

V, lu, II, l**, u, 

16.17. 19&20 




Ann.Microbiol. (Paris) 1979. 130B:179-189




1/2a, 1/2b, 3c,
4ab, 6a & 6b 




Clin.Invest.Med. 1984.7:229-232 




$LMUP35 
9685 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Listeria innocua 


491 1 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Micrococcus luteus 


N3 
N4 
N8 


4698-B1 
4698-B4 
4698-2 
4698-B3 


The American Type Culture Collection 


Micrococcus luteus 


N17 




Can.J.Microbiol. 1979.25:1027-1035 


Mycobacterium 
smegmatis 


BK-3 
Bol** 
Bo 6 
Bo 611 
B06III 
Mc-2 
Mc-4 
NN 

Phagus lacticola 
Rl 


27203-B1

27204-B1

27205-B1
27205-B2
27205-B3
607-B6
607-B7
11727-B1
11759-B1
607-B1


The American Type Culture Collection 

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HER 317 
HER 330 
HER 333 
HER 335 
HER 334 
HER 331 
HER 316 


Felix d'Herelle Reference
Centre,Quebec,Quebec 




Legendre 
Leo 
Roy 
Sedge 












Mol.Microbiol. 1993.7:395-405 








J.Mol.Biol.1998.279:143-164








Proc.Natl.Acad.Sci.USA.1988.84:2833-2837








Mol.Biol.Rep. 1981.30:11-15








Proc.Natl.Acad.Sci.USA 1997.94:10961-10966




29M,31M,122, 
154, 37, 29D, 46, 
139,110, 141, 
74D, AG1 & 
DS6A 




Arch.Virol.1993.133:39-49 &
Am.Rev.Respir.Dis. 1975.112:17-22


Mycobacterium 
fortuitum 


Bo 4 
Bo 7 


23052-B1 
27207-B1 
27207-B2 


The American Type Culture Collection 
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Mycobacterium leprae 






Ann.Microbiol. (Paris) 1982.133:93-97 


Mycobacterium 
tuberculosis 


DS6A 


25618-B1 
25618-B2 
4243-B1


The American Type Culture Collection 




110, 139&33D 




Arch.Virol.1993.133:39-49




AG1,GS4E, 
BGl, 

PH&BKl 




The Biology of Mycobacteria. Academic 
Press/Toronto 1982 (Ratledge & Stanford) 
1982:309-351


Mycobacterium sp 


Phagus pellegrini

NN 

Bl 


11760-B1

11761-B1
23239-B1


The American Type Culture Collection
TM4, ph60,

ph72,

phAE39,

phAE40

& Bxb1




Microbiology 1995.141:1173-1181 




C2 




Experientia 1969.25:1112-1113




18 & 115 




J.Gen.Virol.1987.68:949-956




63 




Gruzlica 1968.36:617-622 




phlei & 
butyricum 




J.Gen.Virol.1975.29:235-238




MyF3P-59a 




Z.Allg.Mikrobiol.1968.8:29-37




Bo2a 




J.Gen.Virol. 1973.20:75-87




D4.D28&D32 




J.Exptl.Med.1966.123:327-340




HC 




J.Bacteriol.1963.86:608-609


Mycobacterium 
vaccae 


B5 


15483-B1 


The American Type Culture Collection 


Mycobacterium phlei 


NN 
Bo 2 
Bo2h 
Bo 3 


11728-B1
11758-B1
27086-B2
27086-B1


The American Type Culture Collection 


Mycoplasma 
arthritidis 


MAV1** 




Infect.Immunity. 1995.63:4016-4023


Mycoplasma hyorhinis 


Hr-1 




Arch.Virol.1983.77:81-85


Mycoplasma 
pneumoniae 


Br-1 




Arch.Virol.1983.75:1-15


Mycoplasma pulmonis 






Plasmid 1995. 33:41-49 


Mycoplasma sp. 






J.Gen.Microbiol. 1985.131:3117-3126








J.Virol.1986.59:584-590








Gene 1994. 141: 1-8 
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Microbios 1990. 64: 111-125 






Infection& Immunity 1995. 63: 4016-4023 






Med.Biol.1982.60:116-120


MV-L2& 

• 




Arch.Virol.1979.61:289-296






Acta. Virol. 1978.22:443-450 






J.Gen.Virol.1979.42:315-322






Virology 1973.55:118-126 



WO 00/32825 



PCT/IB99/02040 
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Science 1971.173:725-727 


Neisseria perflava 






J.Clin.Microbiol.1976. 4:87-91 


Nocardia erythropolis


φC




J.Gen.Virol.l974.23:247-254 


φEC




J.Bacteriol. 1976.126:1104-1107


Pasteurella multocida 


B225 




Arch.Exp.Veterinarmed.1981.35:433-436


B939a 




Am.J.Vet.Res.1978.39:1565-1566


Nos.115,32, 967 
& 

1075 




Vet.Med.Nauki. 1977.14:33-36


Propionibacterium 
acnes 


NN 


29399-B1 


The American Type Culture Collection
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Pseudomonas 
aeruginosa 





12175-B1 


The American Type Culture Collection 


2 


12175-B2 




2A 


12175-B3 




2B 


12175-B4 




11 


14205-B1 




16 


14206-B1 




24 


14207-B1 




27 


14208-B1 




44 


14209-B1 




73 


14210-B1 




95 


14211-B1




109 


14212-B1 




113 


14213-B1 




249 


14214-B1 




B3 


15692-B1 




Hoff2 


14203-B1 




Hoff3 


14204-B1 




Pa 


12055-B1 




Pb 


12055-B2 




PB-1 


15692-B3 




Pc 


12055-B3 




Pf 


25102-B1 














Felix d'Herelle Reference 






Centre,Quebec,Quebec 


7&31 






PI3<.o 




J.Virol.1983.47:221-223


φ-MC




Can.J.Microbiol.1969.15:1179-1186


Pf1**




J.Mol.Biol.1991.218:349-364


PR4** 




J.Gen. Virol. 1979.43:583-592 


A7 




J.Bacteriol.1992.174:2407-2411


KF1 




J.Biochem. 1983.93:61-71


φCTX**




Mol.Microbiol. 1993.4: 1703-1709 


a** 




J.Virol. 1977.24:135-141
φKZ, 21, φNZ,
PMN17, PTB80,
68, PB-1, E79,
16,

109, 352, 1214,
F8, 71, 337, M4,
9C17, SL2, B17,
Li-24, φmnP78,
PS17**, φ1, 73,
M6, Li-2, 7,
φmnF82,
PTB2, PTB20,
PTB42, φKF77,
31, PTB21,
119x,

φPLS27, B3,
258,

Hwi2, PM57,
PM62, PM105,
148, PM681,
198,

218, 222, 242,
246,

PC131, φC11,
SL5,

D3112**, Jb19,
F7,

PM69, PM13,
PM61, PM113,

φ240, 249 & 269
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Pseudomonas 


297,309,318, 




Arch.Virol.1993.131:141-151


aeruginosa 


11. 






(Cont'd) 




















Pseudomonas cepacia 






Felix d'Herelle Reference 
Centre, Quebec, Quebec 


Pseudomonas fragi 


wv 

-JIZ 


27362-B1
27363-B1


The American Type Culture Collection 


Pseudomonas
phaseolicola






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Pseudomonas putida 


eh-1 


12633-B1 


The American Type Culture Collection 


Pseudomonas syringae 




40492-B1 
21781-B1 


The American Type Culture Collection 


Pseudomonas sp. 


PPS-UJ 




The American Type Culture Collection


Salmonella bareilly 


Sab 2 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Salmonella enteritidis 


1, 2,3 & 6 




Epidemiol.Infect.1995.114:227-236


2a, 3a, 4a, 5a, 6a, 
7a, 8a, 9a, 15, 
19, 20 &21** 




Vet.Med.Nauki.1975.12:55-60


Salmonella newington 


Epsilon 34 




J.Struct.Biol. 1995.115:283-289


Salmonella newport 


16-19 


27869-B1
27869-B2 


The American Type Culture Collection 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


Salmonella paratyphi 


Paratyphoid A 


19940-B1 
12176-B1 


The American Type Culture Collection 


Jersey 




Felix d'Herelle Reference
Centre,Quebec,Quebec


Salmonella 
senftenberg 


<5a«;T 1 SaT 0 Sal 

3, 

SaL4, SaL5 & 
SasL6 




Indian J.Med.Res. 1997.105:47-52


Salmonella 
typhimurium 


P22** 
SL-1 


19585-B1 
40282 


The American Type Culture Collection 


MB78** 




J.Virol. 1982.41:1038-1043


SE1 




J.Gen.Microbiol.l986.132:1035-1041 


LT2 




Virology 1971.45:835-636 


ES18** 




Virology 1970.42:621-632 






J.Virol.1985.56:1034-1036
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P1CM clr-100 




Mol.Gen.Genet.1975.138:113-126




F22 




Genet.Res. 1986.48:139-143




Fels1




J.Gen.Virol. 1978.38:263-272 




Fels2 




Genet.Res.1986.48:139-143




Px 




Mol.Gen.Genet.1970.108: 184-202 




Pike 




Virology 1974.60:503-514 




A3&A4 




J.Bacteriol. 1987.169:1003-1009




HT 




Genet.Res.1976.27:315-322


Salmonella 


IRA 




J.Basic Microbiol. 1990.30:707-716 


typhimurium 


Mudl 




Mol.Gen.Genet. 1986.202:327-330 


(Cont'd) 


P22 (cir4-1, cir5-1
& cir6-1)




Mol.Gen.Genet.1984.198:105-109




BF 23^ 




Mol.Gen.Genet.1976.147:195-202




Kbl 




J.Bacteriol.1974.117:907-908




P221dis 




J.Gen.Virol.1978.41:367-376




PRD1**




Virology 1990.177:445-451 




I r 2** 




J.Gen.Microbiol.1982.128:2797-2804




tf-1 




J.Gen.Microbiol. 1987.133:953-960 








J.Gen.Microbiol.1981.126:389-396


Salmonella 
typhosa/typhi 


8 

23 
25 

HO 

53 
163 

i n< 
i tj 

Vil 

ViVI 


19937-B1

19938-B1

19939-B1

19942-B1

19943-B1

19946-B1

1 77TV W I 

19947-B1
27870-B1 
27870-B2 


The American Type Culture Collection 




01 




Felix d'Herelle Reference
Centre,Quebec,Quebec 




Vill 




Chung Hua Liu Hsing Ping 
H.T.C. 1992. 13:288 




j2 




J.Gen.Microbiol. 1983.129:3395-3400


Salmonella sp. 


P3 

P9a 
P9c 
P10 
102 

Chi(x) 
R34 


25957-B1 

25957-B2 

25957-B3 

25957-B4 

25957-B5 

19945-B1 

9842-B1 

97541 


The American Type Culture Collection 




MG40 




Virology 1968.34:521-530 




P14 




Microb.Pathog. 1990.8:393-402 




PSP3 




Virology 1992.188:414




Ike** 




Zentralbl.Bakteriol. 1976.234:294-304 




P27 & 9NA 




J.Virol.1986.12:921-931


Sphaerotilus natans 


SN1 




Appl.Environ.Microbiol. 1979.37: 1025-1030 
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Shigella dysenteriae 


P2 
φ80


23351-B1 

11456b 

11456a-B1


The American Type Culture Collection 


Shigella flexneri
J.Virol.1981.40:551-559 &




Cp-7**, Cp-9**, 




Eur.J.Biochem.1979.101:59-64 &




co-1 & co-2 




Microbial Drug Resistance 1997.3:165-176 




HB-623 & HB- 




J.Virol.1990.64:5149-5155




746 








EJ-1** 




J.Bacteriol.1992.174:5516-5525




Dp-2 & Dp-4




J.Virol. 1978.26:221-225
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Yersinia enterocolitica 
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Table 2 



bacteriophage 77, complete genome sequence, 41708 nucleotides 

1 gatcaaaata cttggggaac ggttagggag taaacttcgc gataatttta aaaattcatg 

61 tataaccccc ctcttataac cattttaagg caggtgatga aatggagatt atagtcgatg 

121 aaaatttagt gcttaaagaa aaagaaaggc tacaagtatt atataaagac atacctagca 

181 ataaattaaa agtagttgat ggtttaatta ttcaagcagc aaggctacgt gtaatgcttg 

241 attacatgtg ggaagacata aaagaaaaag gtgattatga tttatttact caatctgaaa 

301 aggcgccacc atatgaaagg gaaagaccag tagccaaact atttaatgct agagatgctg 

361 catatcaaaa aataatcaaa caattatcgg atttattgcc cgaagagaaa gaagacacag 

421 aaacgccatc tgatgattac ctatgattag taataaatac gttgatgaat atataaattt 

481 gtggaaacaa ggaaagataa ttttaaataa agaaagaatt gatctcttta attatctaca 

541 aaaacatata tattcacgag atgatgtata ttttgatgaa cagaaaatcg aggattgtat 

601 caaatttatt gaaaaatggt attttccaac attaccattt caaaggttta tcatagctaa 

661 tatatttctt atagataaaa atacagatga agctttcttt acagaatttg ctattttcat 

721 gggacgtgga ggcgggaaaa acggtctaat aagtgctatt agtgattttc tttctacgcc 

781 cttacacgga gttaaagaat atcacatctc cattgttgct aatagtgaag atcaagcaaa 

841 aacatcgttt gatgaaatca gaaccgtttt aatggataac aaacgaaata agacgggtaa 

901 aacgccaaaa gctccttatg aagttagtaa agcaaaaata ataaaccgtg caactaaatc 

961 ggttattcga tataacacat caaacacaaa aaccaaagac ggtggacgtg aggggtgtgt 

1021 tatttttgat gaaattcatt atttctttgg tcctgaaatg gtaaacgtca aacgtggtgg 

1081 attaggtaaa aagaaaaata gaagaacgtt ttatataagt actgatggtt ttgttagaga 

1141 gggttatatc gatgcaatga agcacaaaat tgcaagtgta ttaagtggca aggttaaaaa 

1201 tagtagattg tttgcttttt attgtaagtt agacgatcca aaagaagttg atgacagaca 

1261 gacgtgggaa aaggcgaacc caatgttaca taaaccgtta tcagaatacg ctaaaacact 

1321 gctaagcacg attgaagaag aatataacga tttaccattc aaccgttcaa ataagcccga 

1381 attcatgact aagcgaatga atttgcctga agttgacctt gaaaaagtaa tagcaccatg 

1441 gaaagaaata ctagcgacta atagagagat accaaattta gataatcaaa tgtgtattgg 

1501 tggtttagac tttgcaaaca ttcgagattt tgcaagtgta gggctattat tccgaaaaaa 

1561 cgatgattac atttggttag gacattcgtt tgtaagacaa gggtttttgg atgatgtcaa 

1621 attagaacct cctattaaag aatgggaaaa aatgggatta ttgaccattg tcgatgatga 

1681 tgtcattgaa attgaatata tagttgattg gtttttaaag gctagagaaa aatatgggct 

1741 tgaaaaagtc atagctgata attatagaac tgatattgta agacgtgcgt ttgaggatgc 

1801 tggcataaaa cttgaagtac ttagaaatcc aaaagcaata catggattac ttgcaccacg 

1861 tatcgataca atgtttgcga aacataacgt aatatatgga gacaatcctt tgatgcgttg 

1921 gtttactaat aatgttgctg taaaaatcaa gccggatgga aataaagagt atatcaaaaa 

1981 agatgaagtc agacgtaaaa cggatggatt catggctttt gttcacgcat tatatagagc 

2041 agacgatata gtagacaaag acatgtctaa agcgcttgat gcattaatga gtatagattt 

2101 ctaatagagg aggtgagaca tgagtattct agaaaagata tttaaaacta ggaaagatat 

2161 aacatatatg cttgatttag atatgataga agatctatca caacaagcgt atgtgaaacg 

2221 tttagcgatt gatagttgta ttgaatttgt tgcgcgagct gtcgctcaaa gtcattttaa 

2281 agtattggaa ggtaatagaa ttcaaaagaa tgatgtttac tacaagttaa atataaaacc 

2341 aaatactgac ttatcaagcg atagtttttg gcaacaagtt atatataaac taatttatga 

2401 taacgaggtt ttaatcgtag taagtgacag caaagaatta cttatcgcag atagctttta 

2461 cagagaagag tacgctttgt atgatgatat attcaaagat gtaacggtta aagattatac 

2521 ttatcaacgt actttcacaa tgcaagaggt catatattta aagtacaaca acaataaagt 

2581 gacacacttt gtagaaagtc tattcgaaga ttacgggaaa atattcggaa gaatgatagg 

2641 tgcacaatta aaaaactatc aaataagagg gattttgaaa tctgcctcta gcgcatatga 

2701 cgaaaagaat atagaaaaat tacaagcgtt cacaaataaa ttattcaata cttttaataa 

2761 aaatcaacta gcaatcgcgc ctttgataga aggttttgat tatgaggaat tatctaatgg 

2821 tggtaagaat agtaacatgc ctttttctga attgagtgag ctaatgagag atgcaataaa 

2881 aaatgttgcg ttgatgattg gtatacctcc aggtttgatt tacggagaaa cagctgattt 

2941 ggaaaaaaac acgcttgtat ttgagaagtt ctgtttaaca cctttattaa aaaagattca 

3001 gaacgaatta aacgcgaaac tcataacaca aagcatgtat ttgaaagata caagaataga

3061 aattgtcggt gtgaataaaa aagacccact tcaatatgct gaagcaattg acaaacttgt 

3121 aagttctggt tcatttacaa ggaatgaggt gcggattatg ttaggtgaag aaccatcaga 

3181 caatcctgaa ttagacgaat acctgattac taaaaactac gaaaaagcta acagtggtga 

3241 aaatgatgaa aaagaaaaag atgaaaacac tttgaaaggt ggtgatgaag atgaaagcgg

3301 agattaaagg cgtcatcgtt tccaacgaag ataaatgggt ttacgaaatg cttggtatgg 

3361 attcgacttg tcctaaagat gttttaacac aactagaatt tagtgatgaa gatgttgata 

3421 ttataattaa ctcaaatggt ggtaacctag tagctggtag tgaaatatat acacatttaa 

3481 gagctcataa aggcaaagtg aatgttcgta tcacagcaat agcagcaagt gcggcatcgc 

3541 ttatcgcaat ggctggtgac cacatcgaaa tgagtccggt tgctagaatg atgattcaca 
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3601 atccttcaag tattgcgcaa ggagaagtga aagatctaaa tcatgctgca gaaacattag 

3661 aacatgttgg tcaaataatg gctgaggcat atgcggttag agctggtaaa aacaaacaag 

3721 aacttataga aatgatggct aaggaaacgt ggctaaatgc tgatgaagcc attgaacaag 

3781 gttttgcgga tagtaaaatg tttgaaaacg acaatatgca aattgtagca agcgatacac 

3841 aagtgttacc gaaagatgta ttaaaccgtg taacagcttt ggtaagtaaa acgccagagg 

3901 ttaacattga tatcgacgca atagcaaata aagtaattga aaaaataaat atgaaagaaa 

3961 aggaatcaga aatcgatgtt gcagatagta aattatcagc aaatggattt tcaagattcc 

4021 ttttttaata caaaaatagg aggtcataaa atgactataa atttatcgga aacattcgca 

4081 aatgcgaaaa acgaatttat taatgcagta aacaacggtg aaccgcaaga aagacaaaat 

4141 gaattgtacg gtgacatgat taaccaacta tttgaagaaa ctaaattaca agcaaaagca 

4201 gaagctgaaa gagtttctag tttacctaaa tcagcacaaa ctttgagtgc aaaccaaaga 

4261 aatttcttta tggatatcaa taagagtgtt ggatataaag aagaaaaact tttaccagaa 

4321 gaaacaattg atagaatctt cgaagattta acaacgaatc atccattatt agctgactta 

4381 ggtactaaaa atgctggttt gcgtttgaag ttcttaaaat ccgaaacttc tggcgtggct 

4441 gtttggggta aaatctatgg tgaaattaaa ggtcaattag atgctgcgtt cagtgaagaa 

4501 acagcaattc aaaataaatt gacagcgttt gttgttttac caaaagattt aaatgatttt 

4561 ggtcctgcgt ggattgaaag atttgttcgt gttcaaatcg aagaagcatt tgcagtggcg 

4621 cttgaaactg cgttcttaaa aggtactggt aaagaccaac cgattggctt aaaccgtcaa 

4681 gtacaaaaag gtgtatcggt aactgatggt gcttatccag agaaagaaga acaaggtacg 

4741 cttacatttg ctaatccgcg cgctacggtt aatgaattga cgcaagtgtt taaataccac 

4801 tcaactaacg agaaaggtaa atcagtagcg gttaaaggta atgtaacaat ggttgttaat 

4861 ccgtccgatg cttttgaggt tcaagcacag tatacacatt taaatgcaaa tggcgtatat 

4921 gttactgctt taccatttaa tttgaatgtt attgagtcta cagttcaaga agcaggtaag 

4981 gttttaacgt acgttaaagg tctatatgat ggttatttag ctggtggtat taatgttcag 

5041 aaatttaaag aaacacttgc gttagatgat atggatttat acactgcaaa acaatttgct 

5101 tacggcaaag cgaaagataa taaagttgct gctgtttgga aattagattt aaaaggacat 

5161 aaaccagctt tagaagatac cgaagaaaca ctataaaatt ttatgaggtg ataaaatggt 

5221 gaaatttaaa gttgttagag aatttaaaga catagagcac aatcaacaca agtacaaagt 

5281 aggggagttg tatccagctg aagggtataa caatcctcgt gttgaattgt tgacaaatca 

5341 aatcaaaaat aagtacgaca aagtttatat cgtaccttta gataagctga caaaacaaga 

5401 attattagaa ctatgcgaat cattacaaaa aaaagcgtct agttcaatgg ttaaaagtga 

5461 aatcatcgac ttattgaatg gtgaagacaa tgacgattga tgatttgctt gtcaaattta 

5521 aatcacttga aaagattgac cataattcag aggatgagta cttaaagcag ttgttaaaaa 

5581 tgtcgtacga gcgtataaaa aatcagtgcg gagtttttga attagagaat ttaataggtc 

5641 aagaattgat acttatacgc gctagatatg cttatcaaga tttattagaa cacttcaacg 

5701 acaattacag acctgaaata atagattttt cgttatctct aatggaggta tcagaagatg 

5761 aagaaagtgt ttaagaaacc tagaattaca actaaacgtt taaatacgcg tgttcatttt 

5821 tataagtata ctgaaaataa tggtccagaa gctggagaaa aagaagaaaa attattatat 

5881 agctgttggg cgagtattga tggtgtctgg ttacgtgaat tagaacaagc tatctcaaac 

5941 ggaacgcaaa atgacattaa attgtatatt cgtgatccgc aaggtgatta tttacccagt 

6001 gaagaacatt atcttgaaat tgaatcaaga tatttcaaaa atcgttcgaa tataaagcaa 

6061 gtatcaccag atttggataa taaagacttt attatgattc gcggaggata tagttcatga 

6121 gtgtgaaagt gacaggtgat aaagcattag aaagagaatt agaaaaacat tttggcataa 

6181 aagagatggt aaaagttcaa gataaggcgt taatagctgg tgctaaggta attgttgaag 

6241 aaataaaaaa acaactcaaa ccttcagaag actcaggagc actgattagt gagattggtc 

6301 gtactgaacc tgaatggata aaggggaaac gtactgttac aattaggtgg cgtgggcctt 

6361 ttgaacgatt tagaatagta catttaattg aaaatggtca tgttgagaaa aagtcaggaa 

6421 aatttgtaaa acctaaagct atgggtggga ttaatagagc aataagacaa gggcaaaata 

6481 agtattttga gacgctaaaa agggagttga aaaaattgtg attgatattt tgtacaaagt 

6541 tcatgaagtg attagtcaag acagaattat tagagagcac gtaaatatca ataatattaa 

6601 gttcaataaa taccctaatg taaaagatac tgatgtacct tttattgtta ttgacgatat 

6661 cgacgaccca atacctacaa cttatactga cggagatgag tgtgcatata gttatattgt 

6721 ccaaatagat gtttttgtta agtacaatga tgaatataat gcgagaatca taagaaataa 

6781 gatatctaat cgcattcaaa agttattatg gtctgaacta aaaatgggaa atgtttcaaa 

6841 tggaaaaccg gaatatatag aagaatttaa aacatataga agctctcgcg tttacgaggg 

6901 cattttttat aaggaggaaa attaaatggc agtaaaacat gcaagtgcgc caaaggcgta 

6961 tactaacatt actggtttag gtttcgctaa attaacgaaa gaaggcgcgg aattaaaata 

7021 tagtgatatt acaaaaacaa gaggattaca aaaaattggt gttgaaactg gtggagaact 

7081 aaaaacagct tatgctgatg gcggtccaat tgaatcaggg aatacagacg gagaaggtaa 

7141 aatctcatta caaatgcatg cgttccctaa agagattcgc aaaattgttt ttaatgaaga 

7201 ttatgatgaa gatggcgttt acgaagagaa acaaggtaaa caaaacaatt acgtagctgt 

7261 atggctcaga caagagcgta aagacggtac atttagaaca gttttattac ctaaagttat

7321 gtttacaaat cctaaaatcg atggagaaac ggctgagaaa gattgggatt tctcaagtga 

7381 agaggttgaa ggtgaggcac ttttcccttt agttgataat aaaaagtcag tacgtaagta 

7441 tatctttgat tcagctaaca tgacaaatca tgatggagac ggtgaaaaag gcgaagaggc 

7501 tttcttaaag aaaattttag gcgaagaata tactggaaac gtgacagagg gtaacgaaga 

7561 aactttgtaa caaaaccggc ttcatcggaa actgcggtaa agtcggttaa tataccagat 

7621 agcattaaaa cacttaaagt tggcgacaca tacgatttaa atgttgtagt agagccatct 
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7681 aatcaaagta agttattgaa atacacaaca gatcaaacga atattgtatc aatcaatagt 

7741 gatggtcaag ttactgcgga agcacaaggc attgctacgg ttaaagcaac agttggtaat 

7801 atgagtgaca ctataacaat aaatgtagaa gcataagagg gggcaacccc tctattttat 

7861 ttgaaaataa ggagagtatt ataaaatggc aaaattaaaa cgtaacatta ttcaattagt 

7921 agaagatcca aaagcaaatg aaattaaatt acaaacgtac ttaacaccac acttcatttc 

7981 atttgaaatt gtatacgaag caatggattt aatcgatgat attgaggacg aaaatagcac 

8041 gatgaagcca agagaaatcg ctgacagatt gatggatatg gttgtaaaaa tttacgataa 

8101 ccaattcaca gttaaagacc taaaagaacg tatgcatgca cctgatggaa tgaatgcact 

8161 tcgtgaacaa gtgattttca ttactcaagg tcaacaaact gaggaaacta gaaattttat 

8221 ccagaacatg aaataaagcc tgaagattta acatataaag caatgttgaa aaatatggat 

8281 actctcatga tggacttaat tgaaaatggt aaagacgcta acgaagtttt aaaaatgcca 

8341 tttcattatg tgctttccat atatcaaaat aaaaataatg acatttctga agaaaaagca 

8401 gaggctttaa ttgatgcatt ttaaccttaa ccgtttggtt agggttattt ctttgaactt 

8461 ttttagaaag gaggtaaaaa atgggagaaa gaataaaagg tttatctata ggtttggatt 

8521 tagatgcagc aaatttaaat agatcatttg cagaaatcaa acgaaacttt aaaactttaa 

8581 attctgactt aaaattaaca ggcaacaact tcaaatatac cgaaaaatca actgatagtt 

8641 acaaacaaag gattaaagaa cttgatggaa ctatcacagg ttataagaaa aacgttgatg 

8701 atttagccaa gcaatatgac aaggtatctc aagaacaggg cgaaaacagt gcagaagctc 

8761 aaaagttacg acaagaatat aacaaacaag caaatgagct gaattattta gaaagagaat 

8821 tacaaaaaac atcagccgaa tttgaagagt tcaaaaaagc tcaagttgaa gctcaaagaa 

8881 tggcagaaag tggctgggga aaaaccagta aagtttttga aagtatggga cctaaattaa 

8941 caaaaatggg tgatggttta aaatccattg gtaaaggttt gatgattggt gtaactgcac 

9001 ctgttttagg tattgcagca gcatcaggaa aagcttttgc agaagttgat aaaggtttag 

9061 atactgttac tcaagcaaca ggcgcaacag gcagtgaatt aaaaaaattg cagaactcat 

9121 ttaaagatgt ttatggcaat tttccagcag atgctgaaac tgttggtgga gttttaggag 

9181 aagttaatac aaggttaggt tttacaggta aagaacttga aaatgccaca gagtcattct 

9241 tgaaattcag tcatataaca ggttctgacg gtgtgcaagc cgtacagtta attacccgtg 

9301 caatgggcga tgcaggtatc gaagcaagtg aatatcaaag tgttttggat atggtagcaa 

9361 aagcggcgca agctagtggg ataagtgttg atacattagc tgatagtatt actaaatacg 

9421 gcgctccaat gagagctatg ggctttgaga tgaaagaatc aattgcttta ttctctcaat 

9481 gggaaaagtc aggcgttaat actgaaatag cattcagtgg tttgaaaaaa gctatatcaa 

9541 attggggtaa agctggtaaa aacccaagag aagaatttaa gaagacatta gcagaaattg 

9601 aaaagacgcc ggatatagct agcgcaacaa gtttagcgat tgaagcattt ggtgcaaagg 

9661 caggtcctga tttagcagac gctattaaag gtggtcgctt tagttatcaa gaatttttaa 

9721 aaactattga agattcccaa ggcacagtaa accaaacatt taaagattct gaaagtggct 

9781 ccgaaagatt taaagtagca atgaataaat taaaattagt aggtgctgat gtatgggctt 

9841 ctattgaaag tgcgtttgct cccgtaatgg aagaattaat caaaaagcta tctatagcgg 

9901 ttgattggtt ttccaattta agtgatggtt ctaaaagatc aattgttatt ttcagtggta 

9961 ttgctgctgc aattggtcct gtagtttttg ggttaggtgc atttataagt acaattggca 

10021 atgcagtaac tgtattagct ccattgttag ctagtattgc aaaggctggt ggattgatta 

10081 gttttttatc gactaaagta cctatattag gaactgtctt cacagcttta actggtccaa 

10141 ttggcattgt attaggtgta ttggctggtt tagcagtcgc atttacaatt gcttataaga 

10201 aatctgaaac atttagaaat tttgttaatg gtgcaattga aagtgttaaa caaacattta

10261 gtaattttat tcaatttatt caacctttcg ttgattctgt taaaaacatc tttaaacaag 

10321 cgatatcagc aatagttgat ttcgcaaaag atatttggag tcaaatcaat ggattcttta 

10381 atgaaaacgg aatttccatt gttcaagcac ttcaaaatat atgcaacttt attaaagcga 

10441 tatttgaatt tattttaaat tttgtaatta aaccaattat gttcgcgatt tggcaagtga 

10501 tgcaatttat ttggccggcg gttaaagcct tgattgtcag tacttgggag aacataaaag 

10561 gtgtaataca aggtgcttta aatatcatac ttggcttgat taagttcttc tcaagtttat 

10621 tcgttggtga ttggcgagga gtttgggacg ccgttgtgat gattcttaaa ggagcagttc 

10681 aattaatttg gaatttagtt caattatggt ttgtaggtaa aatacttggt gttgttaggt 

10741 actttggcgg gttgctaaaa ggattgatag caggaatttg ggacgtaata agaagtatat 

10801 tcagtaaatc tttatcagca atttggaatg caacaaaaag tatttttgga tttttattta 

10861 atagcgtaaa atcaattttc acaaatatga aaaattggtt atctaatact tggagcagta 

10921 tccgtacgaa tacaatagga aaagcgcagt cattatttag tggcgtcaaa tcaaaattta 

10981 ctaatttatg gaatgcgacg aaagaaattt ttagtaattt aagaaattgg atgtcaaata 

11041 tttggaattc cattaaagat aatacggtag gaattgcaag ccgtttatgg agtaaggtac 

11101 gtggaatttt cacaaatatg cgcgatggct tgagttccat tatagataag attaaaagtc 

11161 atatcggcgg tatggtaagc gctattaaaa aaggacttaa taaattaatc gacggtttaa 

11221 actgggtcgg tggtaagttg ggaatggata aaatacctaa gctacacact ggtacagagc 

11281 acacacatac tactacaaga ttagttaaga acggtaagat tgcacgtgac acattcgcta 

11341 cagttgggga taagggacgc ggaaatggtc caaatggttt tagaaatgaa atgattgaat

11401 tccctaacgg taaacgtgta atcacaccta atacagatac taccgcttat ttacctaaag 

11461 gctcaaaagt atacaacggt gcacaaactt attcaatgtc aaacggaacg cttccaagat 

11521 ttagtttagg tactatgtgg aaagatatta aatctggtgc atcatcggca tttaactgga 

11581 caaaagataa aataggtaaa ggtaccaaat ggcttggcga taaagttggc gatgttttag 

11641 attttatgga aaatccaggc aaacttttaa attatatact tgaagctttt ggaattgatt 

11701 tcaattcttt aactaaaggt atgggaattg caggcgacat aacaaaagct gcatggtcta
11761 agattaagaa aagtgctact gattggataa aagaaaattt agaagctatg ggcggtggcg
11821 atttagtcgg cggaatatta gaccctgaca aaattaatta tcattatgga cgtaccgcag 
11881 cttataccgc tgcaactgga agaccatttc atgaaggtgt cgattttcca tttgtatatc 
11941 aagaagttag aacgccgatg ggtggcagac ttacaagaat gccatttatg tctggtggtt 
12001 atggtaatta tgtaaaaatt actagcggcg ttatcgatat gctatttgcg catttgaaaa 
12061 actttagcaa atcaccacct agtggcacga tggtaaagcc cggtgatgtt gttggtttaa 
12121 ctggtaatac cggatttagt acaggaccac atttacattt tgaaatgagg agaaatggac 
12181 gacattttga ccctgaacca tatttaagga atgctaagaa aaaaggaaga ttatcaatag 
12241 gtggtggcgg tgctacttct ggaagtggcg caacttatgc cagtcgagta atccgacaag 
12301 cgcaaagtat tttaggtggt cgttataaag gtaaatggat tcatgaccaa atgatgcgcg 
12361 ttgcaaaacg tgaaagtaac taccagtcaa atgcagtgaa taactgggat ataaatgctc 
12421 aaagaggaga cccatcaaga ggattattcc aaatcatcgg ctcaactttt agagcaaacg 
12481 ctaaacgtgg atatactaac tttaataatc cagtacatca aggtatctca gcaatgcagt 
12541 acattgttag acgatatggt tggggtggtt ttaaacgtgc tggtgattac gcatatgcta 
12601 caggtggaaa agtttttgat ggttggtata acttaggtga agacggtcat ccagaatgga 
12661 ttattccaac agatccagct cgtagaaatg atgcaatgaa gattttgcat tatgcagcag 
12721 cagaagtaag agggaaaaaa gcgagtaaaa ataagcgtcc tagccaatta tcagacttaa 
12781 acgggtttga tgatcctagc ttattattga aaatgattga acaacagcaa caacaaatag 
12841 ctttattact gaaaatagca caatctaacg atgtgattgc agataaagat tatcagccga 
12901 ttattgacga atacgctttt gataaaaagg tgaacgcgtc tatagaaaag cgagaaaggc 
12961 aagaatcaac aaaagtaaag tttagaaaag gaggaattgc tattcaatga tagacactat 
13021 taaagtgaac aacaaaacaa ttccttggtt gtatgtcgaa agagggtttg aaataccctc 
13081 ttttaattat gttttaaaaa cagaaaatgt agatggacgt tcggggtcta tatataaagg 
13141 gcgtaggctt gaatcttata gttttgacat acctttggtg gtacgtaatg actatttatc 
13201 tcacaacggc attaaaacac atgatgacgt cttgaatgaa ttagtaaagt tttttaacta 
13261 cgaggaacaa gttaaattac aattcaaatc taaagattgg tactggaacg cttatttcga 
13321 aggaccaata aagctgcaca aagaatttac aatacctgtt aagttcacta tcaaagtagt 
13381 actaacagac ccttacaaat attcagtaac aggaaataaa aatactgcga tttcagacca 
13441 agtttcagtt gtaaatagtg ggactgctga cactccttta attgttgaag cccgagcaat 
13501 taaaccatct agttacttta tgattactaa aaatgatgaa gattatttta tggttggtga 
13561 tgatgaggta accaaagaag ttaaggatta catgcctcct gtttatcata gtgagtttcg 
13621 tgatttcaaa ggttggacta agatgactac tgaagatatt ccaagtaatg acttaggtgg 
13681 taaggtcggc ggtgactttg tgatatccaa tcttggcgaa ggatataaag caactaattt 
13741 tcctgatgca aaaggttggg ttggtgctgg cacgaaacga gggctcccta aagcgatgac 
13801 agattttcaa attacctata aatgtattgt tgaacaaaaa ggtaaaggtg ccggaagaac 
13861 agcacaacat atttatgata gtgatggtaa gttacttgct tctattggtt atgaaaataa 
13921 atatcatgat agaaaaatag gacatattgt tgttacgttg tataaccaaa aaggagaccc 
13981 caaaaagata tacgactatc agaataaacc gataatgtat aacttggaca gaatcgttgt 
14041 ttatatgcgg ctcagaagag taggtaataa attttctatt aaaacttgga aatttgatca 
14101 cattaaagac ccagatagac gtaaacctat tgatatggat gagaaagagt ggatagatgg 
14161 cggtaagttt tatcagcgtc cagcttctat catagctgtc tatagtgcga agtataacgg 
14221 ttataagtgg atggagatga atgggttagg ttcatccaat acggagattc taccgaaacc 
14281 gaaaggcgca agggatgtca ttatacaaaa aggtgattta gtaaaaatag atatgcaagc 
14341 aaaaagtgtt gtcatcaatg aggaaccaat gttgagcgag aaatcgtttg gaagtaatta 
14401 tttcaatgtt gattctgggt acagtgaatt aatcatacaa cctgaaaacg tctttgatac 
14461 gacggttaaa tggcaagata gatatttata gaaaggagat gagagtgtga tacatgtttt 
14521 agattttaac gacaagatta tagatttcct ttctactgat gacccttcct tagttagagc 
14581 gattcataaa cgtaatgtta atgacaattc agaaatgctt gaactgctca tatcatcaga 
14641 aagagctgaa aagttccgtg aacgacatcg tgttattata agggattcaa acaaacaatg 
14701 gcgtgaattt attattaact gggttcaaga tacgatggac ggctacacag agatagaatg 
14761 tatagcgtct tatcttgctg atataacaac agctaaaccg tatgcaccag gcaaatttga 
14821 gaaaaagaca acttcagaag cattgaaaga tgtgttgagc gatacaggtt gggaagtttc 
14881 tgaacaaacc gaatacgatg gcttacgtac tacgtcatgg acttcttatc aaactagata 
14941 tgaagtttta aagcaattat gtacaaccta taaaatggtt ttagattttt atattgagct 
15001 tagctctaat accgtcaaag gtagatatgt agtactcaaa aagaaaaaca gcttattcaa 
15061 aggtaaagaa attgaatatg gtaaagattt agtcgggtta actaggaaga ttgatatgtc 
15121 agaaatcaaa acagcattaa ttgctgtggg acctgaaaat gacaaaggga agcgtttaga 
15181 gctagttgtg acagatgacg aagcgcaaag tcaattcaac ctacctatgc gctatatttg 
15241 ggggatatat gaaccacaat cagatgatca aaatatgaat gaaacacgat taagttcttt 
15301 agccaaaaca gagttaaata aacgtaagtc ggcagttatg tcatatgaga ttacttctac 
15361 tgatttggaa gttacgtatc cgcacgagat tatatcaatt ggcgatacag tcagagtaaa 
15421 acatagagat tttaacccgc cattgtatgt agaggcagaa gttattgctg aagaaeataa
15481 cataatttca gaaaatagca catatacatt cggtcaacct aaagagttca aagaatcaga 
15541 attacgagaa gagtttaaca agcgattgaa cataatacat caaaagttaa acgataatat 
15601 tagcaatatc aacactatag ttaaagatgt tgtagatggt gaattagaat actttgaacg 
15661 caaaatacac aaaagtgata caccgccaga aaatccagtc aatgatatgc tttggtatga 
15721 tacaagtaac cctgatgttg ctgtcttgcg tagatattgg aatggtcgat ggattgaagc 
15781 aacaccaaat gatgtcgaaa aattaggtgg tataacaaga gagaaagcgc tattcagtga
15841 attaaacaat atttttatta atttatctat acaacacgct agtcttttgt cagaagctac

15901 agaattactg aatagcgagt acttagtaga taatgatttg aaagcggact tacaagcaag 

15961 tttagacgct gtgattgatg tttataatca aattaaaaat aatttagaat ctatgacacc 

16021 cgaaactgca acgattggtc ggttggtaga tacacaagct ttatttcttg agtatagaaa 

16081 gaaattacaa gatgtttata cagatgtaga agatgtcaaa atcgccattt cagatagatt 

16141 taaattatta cagtcacaat acactgatga aaaatataaa gaagcgttgg aaataatagc 

16201 aacaaaattt ggtttaacgg tgaatgaaga tttgcagtta gtcggagaac ctaatgttgt 

16261 taaatcagct attgaagcag ctagagaatc cacaaaagaa caattacgtg actatgtaaa 

16321 aacatcggac tataaaacag acaaagacgg tattgttgaa cgttcagata ctgctgaagc 

16381 tgagagaacg actttaaaag gtgaaatcaa agataaagtt acgttaaacg aatatcgaaa 

16441 cggattggaa gaacaaaaac aatatactga tgaccagtta agtgatttgt ccaataatcc 

16501 tgagattaaa gcaagtattg aacaagcaaa tcaagaagcg caagaagctt taaaatcata 

16561 cattgatgct caagatgatc ttaaagagaa ggaatcgcaa gcgtatgctg atggtaaaat 

16621 ttcggaagaa gagcaacgcg ctatacaaga tgctcaagct aaacttgaag aggcaaaaca 

16681 aaacgcagaa ctaaaggcta gaaacgctga aaagaaagct aatgcttata cagacaacaa 

16741 ggtcaaagaa agcacagatg cacagaggaa aacattgact cgctatggtt ctcaaattat 

16801 acaaaatggt aaggaaatca aattaagaac tactaaagaa gagtttaatg caaccaatcg 

16861 tacactttca aatatattaa acgagattgt tcaaaatgtt acagatggaa caacaatcag 

16921 atatgatgat aacggagtgg ctcaagcttt gaatgcgggg ccacgtggta ttagattaaa 

16981 tgctgataaa attgatatta acggtaatag agaaataaac cttcttatcc aaaatatgcg 

17041 agataaagta gataaaaccg atattgtcaa cagtcttaat ttatcaagag agggtcttga 

17101 tatcaatgtt aatagaattg gaattaaagg cggtgacaat aacagatatg ttcaaataca 

17161 gaatgactct attgaactag gtggtattgt gcaacgtact tggagaggga aacgttcaac 

17221 agacgatatt tttacgcgac tgaaagacgg tcacctaaga tttagaaata acaccgctgg 

17281 cggttcactt tatatgtcac attttggtat ttcgacttat attgatggtg aaggtgaaga 

17341 cggtggttca tccggtacga ttcaatggtg ggataaaact tacagtgata gtggcatgaa 

17401 tggtataaca atcaattcct atggtggtgt cgttgcacta acgtcagata ataatcgggt 

17461 tgttctggag tcttacgctt catcgaatat caaaagcaaa caggcaccgg tgtatttata 

17521 tccaaacaca gacaaagtgc ctggattaaa ccgatttgca ttcacgctgt ctaatgcaga 

17581 taatgcttat tcgagtgacg gttatattat gtttggttct gatgagaact atgattacgg 

17641 tgcgggtatc aggttttcta aagaaagaaa taaaggtctt gttcaaattg ttaatggacg 

17701 atatgcaaca ggtggagata caacaatcga agcagggtat ggcaaattta atatgctgaa 

17761 acgacgtgat ggtaataggt atattcatat acagagtaca gacctactgt ctgtaggttc 

17821 agatgatgca ggagatagga tagcttctaa ctcaatttat agacgtactt attcggccgc 

17881 agctaatttg catattactt ctgctggcac aattgggcgt tcgacatcag cgcgtaaata 

17941 caagttatct atcgaaaatc aatataacga tagagatgaa caactggaac attcaaaagc 

18001 tattcttaac ttacctatta gaacgtggtt tgataaagct gagtctgaaa ttttagctag 

18061 agagctgaga gaagatagaa aattatcgga agacacctat aaacttgata gatacgtagg 

18121 tttgattgct gaagaggtgg agaatttagg attaaaagag tttgtcacgt atgatgacaa 

18181 aggagaaatt gaaggtatag cgtatgatcg tctatggatt catcttatcc ctgttatcaa 

18241 agaacaacaa ctaagaatca agaaactgga ggagtcaaag aatgcaggat aacaaacaag 

18301 gattacaagc taatcctgaa tatacaattc attatttatc acaggaaatt atgaggttaa 

18361 cacaagaaaa cgcgatgtta aaagcgtata tacaagaaaa taaagaaaat caacaatgtg 

18421 ctgaggaaga gtaatcctta gcactatttt tatacaaaaa tttaaggagg tcatttaatt 

18481 atggcaaaag aaattatcaa caatacagaa aggtttattt tagtacaaat cgacaaagaa 

18541 ggtacagaac gtgtagtata ccaagatttc acaggaagtt ttacaacttc tgaaatggtt 

18601 aaccatgctc aagattttaa atctgaagaa aacgctaaga aaattgcgga gacgttaaat 

18661 ttgttatatc aattaactaa caaaaaacaa cgtgtgaaag tagttaaaga agtagttgaa 

18721 agatcagatt tatctccaga ggtaacagtt aacactgaaa cagtatgaaa agctatgagt 

18781 tagatactca tagtctttat tcttttagaa agcgggtgta ctgaattggg gtggttcaaa 

18841 aaacacgaac atgaatggcg catcagaagg ttagaagaga atgataaaac aatgctcagc 

18901 acactcaacg aaattaaatt aggtcaaaaa acccaagagc aagttaacat taaattagat 

18961 aaaaccttag atgctattca aaaagaaaga gaaatagatg aaaagaataa gaaagaaaat 

19021 gataagaaca tacgtgatat gaaaatgtgg gtgcttggtt tagttgggac aatatttggg 

19081 tcgctaatta tagcattatt gcgtatgctt atgggcatat aagagaggtg attaccatgt 

19141 tcggattaaa ttttggagct tcgctgtgga cgtgtttctg gtttggtaag tgtaagtaat 

19201 agttaagagt cagtgcttcg gcactggctt tttattttgg ataaaaggag caaacaaatg 

19261 gatgcaaaag taataacaag atacatcgta ctgatcttag cattagtaaa tcaattctta 

19321 gcgaacaaag gtattagccc aattccagta gacgatgaaa ctatatcatc aataatactt 

19381 actgtagtcg ctttatatac aacgtataaa gacaatccaa catctcaaga aggtaaatgg 

19441 gcaaatcaaa aattaaagaa atataaagct gaaaataagt atagaaaagc aacagggcaa 

19501 gcgccaatta aagaagtaat gacacctacg aatatgaacg acacaaatga tttagggtag 

19561 gtggttgata tatgttaatg acaaaaaatc aagcagaaaa atggtttgac aattcattag 

19621 ggaaacaatc caacccagat ggttggtatg gatttcagtg ttatgattac gccaatatgt 

19681 tctttatgtt agcgacaggc gaaaggctgc aaggtttata tgcttataat atcccgtttg 

19741 ataataaagc aaagattgaa aaatatggtc aaataattaa aaactatgac agctttttac 

19801 cgcaaaagtt ggatattgtc gttttcccgt caaagtatgg tggcggagct ggacacgttg 

19861 aaattgttga gagcgcaaat ttaaatactt tcacatcatt tggtcaaaac tggaacggta 



WO 00/32825 



19921 aaggttggac taatggcgtt gcgcaacctg gttggggtcc tgaaactgtg acaagacatg

19981 ttcattatta tgacaatcca atgtatttta ttaggttaaa cttccctaac aacttaagcg 

20041 ttggcaataa agctaaaggt attattaagc aagcgactac aaaaaaagag gcagtaatta 

20101 aacctaaaaa aattatgctc gtagccggtc atggttataa cgatcctgga gcagtaggaa 

20161 acggaacaaa cgaacgcgat tttatacgta aatatataac gcctaatatc gctaagtatt 

20221 taagacatgc aggacatgaa gttgcattat acggtggctc aagtcaatca caagatatgt 

20281 atcaagatac tgcatacggt gttaatgtag gcaataaaaa agattatggc ttatattggg 

20341 ttaaatcaca ggggtatgac attgttctag aaatacattt agacgcagca ggagaaagcg 

20401 caagtggtgg gcatgttatt atctcaagtc aattcaatgc agatactatt gataaaagta 

20461 tacaagatgt tattaaaaat aacttaggac aaataagagg tgtgacacct cgtaatgatt 

20521 tactaaatgt taatgtatca gcagaaataa atataaatta tcgtttatct gaattaggtt 

20581 ttattactaa taaaaatgat atggattgga ttaagaaaaa ctatgacttg tattctaaat 

20641 taatagccgg tgcgattcat ggtaagccta taggtggttt ggtagctggt aatgttaaaa 

20701 catcagctaa aaacaaaaaa aatccaccag tgccagcagg ttatacactc gataagaata 

20761 atgtccctta taaaaaagaa caaggcaatt acacagtagc taatgttaaa ggtaataatg 

20821 taagagacgg ttattcaact aattcaagaa ttacaggggt attacccaac aacacaacaa 

20881 ttacgtatga cggtgcatat tgtattaatg gttatagatg gattacttat attgctaata 

20941 gtggacaacg tcgttatata gcgacaggag aggtagacaa ggcaggtaat agaataagta 

21001 gttttggtaa gtttagcacg atttagtatt tacttagaat aaaaattttg ctacattaat 

21061 tatagggaat cttacagtta ttaaataact atttggatgg atgttaatat tcctatacac 

21121 tttttaacat ttctctcaag atttaaatgt agataacagg caggtacttc ggtacttgcc 

21181 tattttttta tgttatagct agccttcggg ctagtttttt gttatgatgt gctacacatg 

21241 catcaactat ttacatctat ccttgttcac ccaagcatgt cactggatgt tctttcttgc 

21301 gatagagagc atagttttca tactactccc cgtagtatat atgactttag cattcccgta 

21361 taacagttta cggggtgctt ttatgttata attgctttta tatagtagga gtgaactata 

21421 tagccgggca gaggccatgt atctgactgt tggtcccaca ggagacatct tccttgtcat 

21481 cactcgatac atatatctta acaacataga aatgttacat tcgctataac cgtatcttaa 

21541 tcgatacggt tatatttatt cccctacaac caacaaaacc acagatccta ttaatttagg 

21601 attgtggtta ttttttgcgt ttttttgggg caaaaaaagg gcagattatt tgaaaaaggg 

21661 caaacgcttg tggaaaagct aaaaggttaa aaatgacaaa aaccttgata caacagtgtt 

21721 tttggacgct cgtgtacgtt agagaatgac cggtttacca tcatacaagg gtgggattaa 

21781 cttgtgttaa aaagccttta atatcagttg ttacaaagga tttgtagcgt ctttaaaaat 

21841 aaaaaagggc agaaaaaggg cagatacctt ttagtacaca agtttttcta atttttgctc 

21901 taactctctg tccattttct ctgttacatg tgtatacacc tttatagtcg ttttttcatc 

21961 tgtatgtcct actcttttca taattgcttt taacgatata ttcatttccg ccaataaact 

22021 tatgtgtgta tgccttagtg tgtgagtagt aactttttta tttatattta atgattctgc 

22081 agctgaggac aatcgtttgt ttatcctact gccttgcata ggatttcctt ggcaagttgt 

22141 gaatataaac cctctatcaa catagcttgg ttcccattgt tgcatctttt tattttctaa 

22201 cattattttt ttcaatacat ttgctatcct tgaattgatg gcgatttttc ttcttgaacc 

22261 tgcggtctta gtagtatctt tgtgaccaaa tccagcatta catttgattc tgtgaatagt 

22321 gccattaata gcgatcgttt tatttttgag gtcaacatct ttaacttgga gagctaataa 

22381 ctcacctatg cgcatacctg ttaaagcttg aacttctaca gccccagcaa ctaaaatacg 

22441 agctctatac tgcatgttat tatcgttcag tataaaatcg cgtatctgta ttacctgttc 

22501 catctctaaa tagttataca ttttcgcttc ttctttttct atatcttcta tcgtcttact 

22561 cttctttggt agtgtgacgc tatttaatat gtgttcgttt ggataattgt aaaatttaac 

22621 ggcgtattta atagcttctt tcatatgtcc aagttgacgc tttacctgat ttgcagaata 

22681 tacgtttgat aatttgttaa taaatgtttg catgtacttt gtatcaattt tgtttaaaag 

22741 taaattttga gaactgttct ttttgatgtt tttgattctt gttttcaaat tatcaagcgt 

22801 cgttacttta aagccagatg tttttatatg atattcaagc cattcatcta ataacgcgtg 

22861 aaaagtcaaa gtttttaatt cgcttgacga cttgttgttt agtttttctt ttattttttc 

22921 ttccaaacga aacattgcct ctttttgcga ttgctttgta ttcttattca agacaacact 

22981 tacacgtttc catttatctg tatacggatc tttgtatttc tcgtagtatc tatacttcgt 

23041 ttcattgttc ttatttttaa atttttcaaa ccacatttta catccctcct caaaattggc 

23101 aaaaaataat aagggtaggc gggctaccca tgaaaattgt ataaaaaaag acgcctgtat 

23161 aaaatacaga cgccacttat aatcacaaga ttacatggtt aattaccaaa aatggtaacg 

23221 aatatatacg tgttttaaag gataaacctt taatatatta aaattatatc atcttatatc 

23281 agggatctgc aatatattat tattaattct atttatcagt aacataatat ccgaagaatc 

23341 tattactgga tttttaattt tttggggtaa aacttttctt atgcgaaact tactaatcgg 

23401 ctggaaagaa tttatgcaag cgtaactatt accttttaat ttttttacct tatcaattgc 

23461 tgatactatg ttattaatgt ttctgtcaat tttatttaat ttattttcaa tttctaaact 

23521 atcagatata aattcaataa aataatcttt agtgatgaat tctgtgttgt ttttttggta 

23581 ttttttatcg aaaacttctt ttaatatagc tgaattattt tgcgcgctaa ttaaatttaa

23641 aaacaatctt aaataatact cccatttcaa atcaaaattc atctttaaat actttttgtt 

23701 ttctttagag gataagggaa taacatttac tatatcctcc gcattagaat catttttatt 

23761 catcactatt gcaaagtgtg aattagaaaa ttctttatta acgtttatac cgaaatctac 

23821 aaaaactatt tctccttgtt taaactttgg ataaaaacct ttatggtttt tttcaccttc 

23881 aaatctcttg agtaaatagt gaatatctga atctaacttt ttaaattttg gatttccaga 

23941 agtttttaat ttattaatgc gcttttctat attatgcgtc atcatttccc ctttattctc
cttctttggt
aaacttcata
ccttttaaac
tactttaatt ctttaatcca
24021 tcaacgtcta cacttgtagg cgttttttga ttagtaaaat
24081 taacttatcg ccatctattt tttgtgaaat aaattccaag
24141 cgataaatct ttaggtaact cataagtgaa tggttgatta
24201 tactatagtt tcttttttta ttttgcaatt agttattttc
24261 actgctgaaa tagacgtctt tttcaaataa gcatgattaa
24321 catatattta aaagtgaggt agtaggtaat aaatataaga
24361 cttaaagtta agattgcttt tttcatgtca atttctcctt tgtttatatt tatattaaag
24421 cgctaaatat acgttattaa tcacaataca actttgccca ttactttaat atcactaaac
24481 gaagcgactt tgatatcatc atacttcgga tttagagata ccaaattaat atagtcttcg
24541 catatatcta cacgcttgat aagacttact ccatctaata caacgagtgc aattgtacca
24601 tctttaatag aatcttcttt cttaataaaa gcgtatgttc cttgttttaa cataggttcc
24661 attgaatcac cattaactaa aatacaaaaa tcagcatttg atggcgtttc gtcttcttta
24721 aaaaatactt cttcatgcaa tatgtcatca tataattctt ctcctatgcc agcaccagtt
24781 gcaccacatg caatatacga tactagttta gactctttat attcatctat agaagtgact
24841 ttattctgtt catctaattg ctcatttgca tagttaagta cgttttcttg gcggggaggt
24901 gtgagttgag aaaatatgtt attgattttt gacattatcg tttcatcttg acgttcttcg
24961 tcaggaactc gataagaatc tacatcatac cccataagcc acgcttcacc gacatttaaa
25021 gttttagata ataagaataa tttatgttgg tctggagaag accttccatt aacatactgg
25081 gataagtgac tttttgacat tttaatattc aattcttttt gaaagggttt cgacttttct
25141 agaatatcta cttgacgcaa gttcctatct ttcataattt gttttaatct ttcagaagtg
25201 ttttgcattg gtaatgcctc cttgaaattc attatatagg aagggaaata aaaatcaata
25261 caaaagttca acttttttaa ctttttgtgt tgacattgtt caaaattggg gttatagtta
25321 ttatagttca aatgtttgaa cttaggaggt gattatttga atactaatac aacttttgat
25381 ttttcgttat tgaacggtaa gatagtcgaa gtgtactcga cacaatttaa ctttgctata
25441 gctttaggtg tatcagaaag aactttgtct ttgaagttga acaacaaagt accatggaaa
25501 acaacagaca ttattaaagc ttgtaagtta ttgggaatac ctataaaaga tgttcacaaa
25561 tattttttta aacagaaagt tcaaacgttt gaacttaata agtaaaggag gcataacaca
25621 tgcaagaacg agaaaaggtt aataaaagta acacatcttc aaatgaagca tcaaaacctt
25681 ttaggacaaa ttgaagctta cgacaaaacg cttaaagaaa taaagtacac tcgagacctt
25741 tacaacaaac acctaagcat gaacaacgaa gacgcattcg ctggtttgga aatggtagag
25801 gatgaaatta ctaaaaagct acgaagtgct atcaaagagt tccaaaaagt agtgaaagcg
25861 ttagacaagc ttaacggtgt tgaaagcgat aacaaagtta ctgatttaac agagtggcgg
25921 aaagtgaatc agtaacattc acttcttaat ataaccacgc ttatcaacat ccacattgag
25981 cagatgtgag cgagagctgg cgatgatatg agccgcgttt aaatacattc gatagtcatt
26041 gcgataaccg tctgctgaat gtgggtgttg aggaaaaagg aggatactca aatgcaagca
26101 ttacaaacat ttaattttaa agagctacca gtaagaacag tagaaattga aaacgaacct
26161 tattttgtag gaaaagatat tgctgagatt ttaggatatg caagatcaaa caatgccatt
26221 agaaatcatg ttgatagcga ggacaagctg acgcaccaat ttagtgcatc aggtcaaaac
26281 agaaatatga tcattatcaa cgaatcagga ttatacagtc taatcttcga tgcttctaaa
26341 caaagcaaaa acgaaaaaat tagagaaacc gctagaaaat tcaaacgctg ggtaacatca
26401 gatgtcctac cagctattcg caaacacggt atatacgcaa cagacaatgt aattgaacaa
26461 acattaaaag atccagacta catcattaca gtgttgactg agtataagaa agaaaaagag
26521 caaaacttac ttttacaaca gcaagtagaa gttaacaaac caaaagtatt attcgctgac
26581 tcggtagctg gtagtgataa ttcaatactt gttggagaac tagcgaaaat acttaaacaa
26641 aacggtgttg atataggaca aaacagattg ttcaaatggt taagaaataa tggatatctc
26701 attaaaaaga gtggagaaag ttataactta ccaactcaaa agagtatgga tctaaaaatc
26761 ttggatatca aaaaacgaat aattaataat ccagatggtt caagtaaagt atcacgtaca
26821 ccaaaagtaa caggcaaagg acaacaatac tttgttaata agtttttagg agaaaaacaa
26881 acatcttaaa aggaggaaca caatggaaca aatcacatta accaaagaag agttgaaaga
26941 aattatagca aaagaagtta gagaggctat aaatggcaag aaaccaatca gttcaggttc
27001 aattttcagt aaagtaagaa tcaataatga cgatttagaa gaaatcaata aaaaactcaa
27061 tttcgcaaaa gatttgtcgc taggaagatt gaggaagctc aatcatccga ttccgctaaa
27121 aaagtatcag catggcttcg aatcaattca tcaaaaagct tatgtacaag atgttcatga
27181 ccatattaga aaattaacat tatcaatttt tggagtgaca cttaattcag acttgagtga
27241 aagtgaatac aacctagcag caaaagttca tcgagaaatc aaaaactatt atttatacat
27301 ctatgaaaag agagtttcag aattaactat cgatgatttc gaataaagga ggaacaacaa
27361 atgttacaaa aatttagaat tgcgaaagaa aaaaataaat taaaactcaa attactcaag
27421 catgctagtt actgtttaga aagaaacaac aaccctgaac tgttgcgagc agttgcagag
27481 ttgttgaaaa aggttagcta aattcaacgg taaggatttg ccctgcctcc acacttagag
27541 tttgagatcc aacaaacaca taagttttag tagggtctag aaaaaatgtt tcgatttcct
27601 cttttgtaac agtttcaatt ccttcatatc ctggaaaaac aattttcttt aaatccgaaa
27661 catgtttttt tgaaccatcc tttaaagtaa ctagaagttt catacttatc acctccttag
27721 gttgataaca acattataca cgaaaggagc ataaacaata tgcaagcatt acaaacaaat
27781 tcgaacatcg gagaaatgtt caatattcaa gaaaaagaaa atggagaaat cgcaatcagc
27841 ggtcgagaac ttcatcaagc attagaagtt aagacagcat ataaagattg gtttccaaga
27901 atgcttaaat acggatttga agaaaataca gattacacag ctatcgctca aaaaagagca
27961 acagctcaag gcaatatgac tcactatatt gaccacgcac tcacactaga cactgcaaaa
28021 gaaatcgcaa tgattcaacg tagtgaacct ggcaaacgtg caagacaata tttcatccaa
28081 gttgaaaaag catggaacag cccagaaatg attatgcaac gtgctttaaa aattgctaac
28141 aacacaatca atcaattaga aacaaagatt gcacgtgaca aaccaaaaat tgtatttgca 
28201 gatgcagtag ctactactaa gacatcaatt ttagttggag agttagcaaa gatcattaaa 
28261 caaaacggta taaacatcgg gcaacgcaga ttgtttgagt ggttacgtca aaacggattc 
28321 cttattaaac gcaagggtgt ggattataac atgcctacac agtattcaat ggaacgtgag 
28381 ttattcgaaa ttaaagaaac atcaatcaca cattcggacg gtcacacatc aattagtaag 
28441 acgccaaaag taacaggtaa aggacaacaa tactttgtta acaagttttt aggagaaaaa 
28501 caaacaactt aataggagga attacaaatg aacgcactat acaaaacaac cctcctcatc 
28561 acaatggcag ttgtgacgtg gaaggtttgg aagattgaga agcacactag aaaacctgtg 
28621 attagtagca gggcgttgag tgactatcta aacaacaaat ctttaaccat accgaaagat 
28681 gctgaaaatt ctactgaatc tgctcgtcgc cttttgaagt tcgccgaaca aactattagc 
28741 aaataacaac attatacacg aaaggaaaga tagaaatgcc aaaaatcata gtaccaccaa 
28801 caccagaaaa cacatataga ggcgaagaaa aatttgtgaa aaagttatac gcaacaccta 
28861 cacaaatcca tcaattgttt ggagtatgta gaagtacagt atacaactgg ttgaaatatt 
28921 accgcaaaga taatttaggt gtagaaaatt tatacattga ttattcacca acaggcactc 
28981 tgattaatat ttctaaattg gaagagtatt tgatcagaaa gcataaaaaa tggtattagg 
29041 aggatattaa atgagcaaca tttataaaag ctacctagta gcagtattat gcttcacagt 
29101 cttagcgatt gtacttatgc cgtttctata cttcactaca gcatggtcaa ttgcgggatt 
29161 cgcaagtacc gcaacattca tgtactacaa agaatgcttt ttcaaagaat aaaaaaactg 
29221 ctacttgttg gagcaagtaa cagtatcaaa cacttaagaa aaaattcatg ttcaatataa 
29281 aacgaaaaac ggaggaagtc aagatgtatt acgaaatagg cgaaatcata cgcaaaaata 
29341 ttcatgttaa cggattcgat tttaagctat tcattttaaa aggtcatatg ggcatatcaa 
29401 tacaagttaa agatatgaac aacgtaccaa ttaaacatgc ttatgtcgta gatgagaatg 
29461 acttagatac ggcatcagac ttatttaacc aagcaataga tgaatggatt gaagagaaca 
29521 cagacgaaca ggacagacta attaacttag tcatgaaatg gtaggaggtc gctatgaagc 
29581 agactgtaac ttatatcatt cgtcataggg atatgccaat ttatataact aacaaaccaa 
29641 ctgataacaa ttcagatatt agttactcca caaatagaaa tagagctagg gagtttaacg 
29701 gtatggaaga agcgagtatc aatatggatt atcacaaagc aatcaagaaa acagtgacag 
29761 aaactattga gtacgaggag gtagaacatg actgaggaaa aacaagaacc acaagaaaaa 
29821 gtaagcatac tcaaaaaact aaagataaat aatatcgctg agaaaaataa aaggaaattc 
29881 tataaatttg cagtatacgg aaaaattggc tcaggaaaaa ccacgtttgc tacaagagat 
29941 aaagacgctt tcgtcattga cattaacgaa ggtggaacaa cggttactga cgaaggatca 
30001 gacgtagaaa tcgagaacta tcaacacttt gtttatgttg taaatttttt acctcaaatt 
30061 ttacaggaga tgagagaaaa cggacaagaa atcaatgttg tagttattga aactattcaa 
30121 aaacttagag atatgacatt gaatgatgtg atgaaaaata agtctaaaaa accaacgttt 
30181 aatgattggg gagaagttgc tgaacgaatt gtcagtatgt acagattaat aggaaaactt 
30241 caagaagaat acaaattcca ctttgttatt acaggtcatg aaggtatcaa caaagataaa 
30301 gatgatgaag gtagcactat caaccctact atcactattg aagcgcaaga acaaattaaa 
30361 aaagctatta cttctcaaag tgatgtgtta gctagggcaa tgattgaaga atttgatgat 
30421 aacggagaaa agaaagctag atatattcta aacgctgaac cttctaatac gtttgaaaca 
30481 aagattagac attcaccttc aataacaatt aacaataaga aatttgcaaa tcctagcatt 
30541 acggacgtag tagaagcaat tagaaatgga aactaaaaat taattaaaag gacggtattt 
30601 aattatgaaa atcacaggac aagcgcaatt tactaaagaa acaaatcaag aaaagtttta 
30661 taacggctca gcagggtttc aagctggaga attcacagtg aaagttaaaa atattgaatt 
30721 caatgataga gaaaatagat atttcacaat cgtatttgaa aatgatgaag gcaaacaata 
30781 taaacataat caatttgtac cgccgtataa atatgatttc caagaaaaac aattgattga 
30841 attagttact cgattaggta ttaagttaaa tcttcctagc ttagattttg ataccaatga 
30901 tcttattggt aagttttgtc acttggtatt gaaatggaaa ttcaatgaag atgaaggtaa 
30961 gtattttacg gatttttcat ttattaaacc ttacaaaaag ggcgatgatg ttgttaacaa 
31021 acctattccg aagacagata agcaaaaagc tgaagaaaat aacggggcac aacaacaaac 
31081 atcaatgtct caacaaagca atccatttga aagcagtggc caatttggat atgacgacca 
31141 agatttagcg ttttaaggtg tggtttaaat gcaatacatt acaagatacc agaaagataa 
31201 cgacggtact tattccgtcg ttgctactgg tgttgaactt gaacaaagtc acattgactt 
31261 actagaaaac ggatatccac taaaagcaga agtagaggtt ccggacaata aaaaactatc 
31321 tatagaacaa cgcaaaaaaa tattcgcaat gtgtagagat atagaacttc actggggcga 
31381 accagtagaa tcaactagaa aattattaca aacagaattg gaaattatga aaggttatga 
31441 agaaatcagt ctgcgcgact gttctatgaa agttgcaagg gagttaatag aactgattat 
31501 agcgtttatg tttcatcatc aaatacctat gagtgtagaa acgagtaagt tgttaagcga 
31561 agataaagcg ttattatatt gggctacaat caaccgcaac tgtgtaatat gcggaaagcc 
31621 tcacgcagac ctggcacatt atgaagcagt cggcagaggc atgaacagaa acaaaatgaa 
31681 ccactatgac aaacatgtat tagcgttatg tcgcgaacat cacaacgagc aacatgcgat 
31741 tggcgttaag tcgtttgatg ataaatacca cttgcatgac tcgtggataa aagttgatga
31801 gaggctcaat aaaatgttga aaggagagaa aaaggaatga atagactaag aataataaaa 
31861 atagcactcc taatcgtcat cttggcggaa gagattagaa atgctatgca tgctgtaaaa 
31921 gtggagaaaa ttttaaaatc tccgtttagt taatacaggt ttttacaaaa gctttaccat 
31981 aggcggacaa actaattgag ccttttttga tgtctattac ccaggggctg taatgtaact 
32041 ttaatacttc aaattcaatg ccagaaagtt tacttattgt ttctaggttg tgtcctgact 
32101 ttaacattct tttaacaaat tctaatcccg aaacaaatct ttgtttttct ataatcttat 
32161 taaagtgatt taaaaactga ggagcataaa acttactata aattcctttt cttgttaagt
32221 aagacatgtc aaaagtttca tttaaaaccc ctaaccttac taggttatta attgaaattt 
32281 cggttgattc tatatctaac ggagagtctt ttattaacgt gtccgatata ttcataccgt 
32341 cattctttgg gtttaaaacc gctctatatt taacggcagg atgtacttcg tgattcttta 
32401 aatgttttaa aagaatagca tcatttgggg ataattgttt aattatttca acaaatgaat 
32461 ggtgggttaa tgagtttttt ctgtcatcca tagatgatgc tattagtttt gcgaacatat 
32521 tacttaaagt tttttcacta atgtaaaact ttgaagcttc tagagcagga cctagaagag 
32581 aaaattgtgg ttcttgtaaa tcatttttag gtacagaaga tatttctttt ttaaattgtt 
32641 ctttgaattt ttcaaattct acttctcttt gataaataac tttatccaca taaaggtgga 
32701 atttcccaaa gacaagttcc caagttttag agaatgtttc tacaggccct tttgatgcgc 
32761 cttcaataat tttatcaata cctttaccta aaataggatc cataattatt cacccccaat 
32821 ctaacgcaat agcgataata aaattatacc agaaaggaga atcaacatga ctgaccaacc 
32881 aagttactac tcaataatta cagcaaatgt cagatacgat aaccgactta ctgacagcga 
32941 aaagttactt tttgcagaaa taacatcttt aagtaacaaa tacggatact gcacagcaag 
33001 taatggttac tttgcaactt tatacaacgt tgttaaggaa actatatctc gtagaatttc 
33061 gaaccttacc aactttggtt atctaaaaat cgaaattatc aaagaaggta atgaagttaa 
33121 acaaaggaag atgtacccct tgacgcaaac gtcaatacct attgacgcaa aaatcaatac 
33181 ccctattgat aattctgtca atacccctat tgacgcaaat gtcaaagaga atattacaag 
33241 tattaataat acaagtaata acaacataaa tagaatagat atattgtcgg gcaacccgac 
33301 agcatcttct ataccctata aagaaattat cgattactta aacaaaaaag cgggcaagca 
33361 ttttaaacac aatacagcta aaacaaaaga ttttattaaa gcaagatgga atcaagattt 
33421 taggttggag gattttaaaa aggtgattga tatcaaaaca gctgagtggc taaacacgga 
33481 tagcgataaa taccttagac cagaaacact ttttggcagt aaatctgagg ggtacctcaa 
33541 tcaaaaaata caaccaactg gcacggatca attggaacgc atgaagtacg acgaaagtta 
33601 ttgggattag ggggatatta tgaaaccact attcagcgaa aagataaacg aaagcttgaa 
33661 aaaatatcaa cctactcatg tcgaaaaagg attgaaatgt gagagatgtg gaagtgaata 
33721 cgacttatat aagtttgctc ctactaaaaa acacccgaat ggctacgagt ataaagacgg 
33781 ttgcaaatgt gaaatctatg aggaatataa gcgaaacaag caacggaaga taaacaacat 
33841 attcaatcaa tcaaacgtta atccgtcttt aagagatgca acagtcaaaa actacaagcc 
33901 acaaaatgaa aaacaagtac acgctaaaca aacagcaata gagtacgtac aaggcttctc 
33961 tacaaaagaa ccaaaatcat taatattgca aggttcatac ggaactggta aaagccacct 
34021 agcatacgct atcgcaaaag cagtcaaagc taaagggcat acggttgctt ttatgcacat 
34081 accaatgttg atggatcgta tcaaagcgac atacaacaaa aatgcagtag agactacaga 
34141 cgagctagtc agattgctaa gtgatattga tttacttgta ctagatgata tgggtgtaga 
34201 aaacacagag cacactttaa ataaactttt cagcattgtt gataacagag taggtaaaaa 
34261 caacatcttt acaactaact ttagtgataa agaactaaat caaaatatga actggcaacg 
34321 tataaattcg agaatgaaaa aaagagcaag aaaagtaaga gtaatcggag acgatttcag 
34381 ggagcgagat gcatggtaac caaagaattt ttaaaaacta aacttgagtg ttcagatatg 
34441 tacgctcaga aactcataga tgaggcacag ggcgatgaaa ataggttgta cgacctattt 
34501 atccaaaaac ttgcagaacg tcatacacgc cccgctatcg tcgaatatta aggagtgtta 
34561 aaaatgccga aagaaaaata ttacttatac cgagaagatg gcacagaaga tattaaggtc 
34621 atcaagtata aagacaacgt aaatgaggtt tattcgctca caggagccca tttcagcgac 
34681 gaaaagaaaa ttatgactga tagtgaccta aaacgattca aaggcgctca cgggcttcta 
34741 tatgagcaag aattaggttt acaagcaacg atatttgata tttagaggtg gacgatgagt 
34801 aaatacaacg ctaagaaagt tgagtacaaa ggaattgtat ttgatagcaa agtagagtgt 
34861 gaatattacc aatatttaga aagtaatatg aatggcacta attatgatca tatcgaaata 
34921 caaccgaaat tcgaattatt accaaaacta gataaacaac gaaagattga atatattgca 
34981 gacttcgcgt tatatctcga tggcaaactg attgaagtta tcgacattaa aggtatgcca 
35041 accgaagtag caaaacttaa agctaagatt ttcagacata aatacagaaa cataaaactc 
35101 aattggatat gtaaagcgcc taagtataca ggtaaaacat ggattacgta cgaggaatta 
35161 attaaagcaa gacgagaacg caaaagagaa atgaagtgat ctaatgcaac aacaagcata 
35221 tataaatgca acgattgata taaggatacc tacagaagtt gaatatcagc attttgatga 
35281 tgtggataaa gaaaaagaag cgctggcaga ttacttatat aacaatcctg acgaaatact 
35341 agagtatgac aatttaaaaa ttagaaacgt aaatgtagag gtggaataaa tgggcagtgt 
35401 tgtaatcatt aataataaac catataaatt taacaatttt gaaaaaagaa ataatggcaa 
35461 agcgtgggat aaatgctgga attgtttcta aacgtgttag aggttgttgg gagttttcag 
35521 aagctttaga cgcgccttat ggcatgcacc taaaagaata tagagaaatg aaacaaatgg 
35581 aaaagattaa acaagcgaga ctcgaacgtg aattggaaag agagcgaaag aaagaggctg 
35641 agctacgtaa gaagaagcca catttgttta atgtacctca aaaacattca cgtgatccgt 
35701 actggttcga tgtcacttat aaccaaatgt tcaagaaatg gagtgaagca taatgagcat 
35761 aatcagtaac agaaaagtag atatgaacaa aacgcaagac aacgttaagc aacctgcgca 
35821 ttacacatac ggcgacattg aaattataga ttttattgaa caagttacgg cacagtaccc
35881 accacaatta gcattcgcaa taggtaatgc aattaaatac ttgtctagag caccgttaaa 
35941 gaatggtcat gaggatttag caaaggcgaa gttttacgtc gatagagtat ttgacttgtg 
36001 ggagtgatga ccatgacaga tagcggacgt aaagaatact taaaacattt tttcggctct 
36061 aagagatatc tgtatcagga taacgaacga gtggcacata tccatgcagt aaatggcact 
36121 tattactttc acggtcatat cgtgccaggt tggcaaggtg tgaaaaagac atttgataca 
36181 gcggaagagc ttgaaacata tataaagcaa agtgatttgg aatatgagga acagaagcaa 
36241 ctaactttat tttaaaaggg cggaaacaat gaaaatcaaa attgaaaaag aaatgaattt
36301 acctgaactt atccaatggg cttgggataa ccccaagtta tcaggtaata aaagattcta 
36361 ttcaaatgat gttgagcgca actgttttgt gacttttcat gttgatagca tcttatgtaa 
36421 tgtgactgga tatgtatcaa ttaacgataa atttactgtt caagaggaga tataacaatg 
36481 aaaatcaaag ttaaaaaaga aatgagatta gatgaattaa ttaaatgggc gcgagaaaat 
36541 ccggatctat cacaaggaaa aatatttttt tcaacaggat ttagtgatgg attcgttcgt 
36601 tttcatccaa atacaaataa gtgttcgacg tcaagtttta ttccaattga tatccccttc 
36661 atagttgata ttgaaaaaga agtaacggaa gagactaagg ttgataggtt gattgaatta 
36721 ttcgagattc aagaaggaga ctataactct acactatatg agaacactag tataaaagaa 
36781 tgtttatatg gcagatgtgt gcctaccaaa gcattctaca tcttaaacga tgacctaact 
36841 atgacgttaa tctggaaaga tggggagttg ctagtatgat gttgaaattt aaagcttggg 
36901 ataaagataa aaaagttatg agtattattg acgaaatcga ttttaatagt gggtacattt 
36961 tgatttcaac aggttataaa agtttcaatg aagtaaaact attacaatac acaggattta 
37021 aagatgtgca cggtgtggag atttatgaag gggatattgt tcaagattgt tattcgagag 
37081 aagtaagttt tatcgagttt aaagaaggag ccttttatat aacttttagc aatgtaactg 
37141 aattactaag tgaaaatgac gatattattg aaattgttgg aaatattttt gaaaatgaga 
37201 tgctattgga ggttatgaga tgacgttcac cttatcagat gaacaatata aaaatctttg 
37261 tactaactct aacaagttat tagataaact tcacaaagca ttaaaagatc gtgaagagta 
37321 caagaagcaa cgagatgagc ttattgggga tatagcgaag ttacgagatt gtaacaaaga 
37381 tctagagaag aaagcaagcg catgggatag gtattgcaag agcgttgaaa aagatttaat 
37441 aaacgaattc ggtaacgatg atgaaagagt taaattcgga atggaattaa acaataaaat 
37501 ttttatggag gatgacacaa atgaataatc gcgaaaaaat cgaacagtcc gttattagtg 
37561 ctagtgcgta taacggtaat gacacagagg ggttgctaaa agagattgag gacgtgtata 
37621 agaaagcgca agcgtttgat gaaatacttg agggaatgac aaatgctatt caacattcag 
37681 ttaaagaagg tattgaactt gatgaagcag tagggattat ggcaggtcaa gttgtctata 
37741 aatatgagga ggaataggaa aatgactaac acattacaag taaaactatt atcaaaaaat 
37801 gctagaatgc ccgaacgaaa tcataagacg gatgcaggtt atgacatatt ctcagctgaa 
37861 actgtcgtac tcgaaccaca agaaaaagca gtgatcaaaa cagatgtagc tgtgagtata 
37921 ccagagggct atgtcggact attaactagt cgtagtggtg taagtagtaa aacgtattta 
37981 gtgattgaaa caggcaagat agacgcggga tatcatggca atttagggat taatatcaag 
38041 aatgatgaag aacgtgatgg aatacccttt ttatatgatg atatagacgc tgaattagaa 
38101 gatggattaa taagcatttt agatataaaa ggtaactatg tacaagatgg aagaggcata 
38161 agaagagttt accaaatcaa caaaggcgat aaactagctc aattggttat cgtgcctata 
38221 tggacaccgg aactaaagca agtggaggaa ttcgaaagtg tttcagaacg tggagcaaaa 
38281 ggcttcggaa gtagcggagt gtaaagacat cttagatcga gttaaggagg ttttggggaa 
38341 gtgacgcaat acttagtcac aacattcaaa gattcaacag gacgaccaca tgaacatatt 
38401 actgtggcta gagataatca gacgtttaca gttattgagg cagagagtaa agaagaagcg 
38461 aaagagaagt acgaggcaca agttaaaaga gatgcagtta ttaaagtggg tcagttgtat 
38521 gaaaatataa gggagtgtgg gaaatgacgg atgttaaaat taaaactatt tcaggtggag 
38581 tttattttgt aaaaacagct gaaccttttg aaaaatatgt tgaaagaatg acgagtttta
38641 atggttatat ttacgcaagt actataatca agaaaccaac gtatattaaa acagatacga 
38701 ttgaatcaat cacacttatt gaggagcatg ggaaatgaat cagctgagaa ttttattaca 
38761 tgacggtagt agtttgatat tacatgaaga tgaattattt aacgaaatag tatttgtttt 
38821 ggacaatttt agaaatgatg atgactattt aacgatagaa aaagattatg gcagagaact 
38881 tgtattgaac aaaggttata tagttgggat caatgttgag gaggcagatg atgattaaca 
38941 tacctaaaat gaaattcccg aaaaagtaca ctgaaataat caaaaaatat aaaaataaag 
39001 cacctgaaga aaaggctaag attgaagatg attttattaa agaaattaaa gataaagaca 
39061 gtgaatttta cagtcctacg atggctaata tgaatgaata tgaattaagg gctatgttaa 
39121 gaatgatgcc tagtttaatt gatactggag atgacaatga tgattaaaaa acttaaaaat 
39181 atggatgggt tcgacatctt tattgttgga atactgtcat tattcggtat attcgcattg 
39241 ctacttgtta tcacattgcc tatctataca gtggctagtt accaacacaa agaattacat 
39301 caaggaacta ttacagataa atataacaag agacaagata aagaagacaa gttctatatt 
39361 gtattagaca acaaacaagt cattgaaaat tccgacttat tattcaaaaa gaaatttgat 
39421 agcgcagata tacaagctag gttaaaagta ggcgataagg tagaagttaa aacaatcggt 
39481 tatagaatac actttttaaa tttatatccg gtcttatacg aagtaaagaa ggtagataaa 
39541 caatgattaa acaaatacta agactattat tcttactagc aatgtatgag ttaggtaagt 
39601 atgtaactga gcaagtgtat attatgatga cggctaatga tgatgtagag gcgccgagtg 
39661 attacgtctt tcgagcggag gtgagtgaat aatgagaata tttatttatg atttgatcgt 
39721 tttgctgttt gctttcttaa tatccatata tattattgat gatggagtga taataaatgc 
39781 attaggaatt tttggtatgt ataaaattat agattccttt tcagaaaata ttataaagag 
39841 gtagataaaa atgaacgagc aaataatagg aagcatatat actttagcag gaggtgttgt 
39901 gctttattca gttaaagaga tttttaggta ttttacagat tctaacttac aacgtaaaaa
39961 aatcaattta gaacaaatat atccgatata tttagattgt tttaaaaagg ctaaaaagat 
40021 gattggagct tatattattc caacagaaca gcatgaattt ttagattttt ttgatattga 
40081 agtctttaat aatttagata agcaaagtaa aaaagcgtat gaaaatgtta ttggatttag 
40141 acaaatgatt aatttatcaa atagagttaa ggcaatggaa gattttaaga tgagtttcaa 
40201 caatgaattt agtacaaatc agattttttt taatccttct tttgttatgg aaacaattgc 
40261 tattataaat gaatatcaaa aagatatatc ttatttaaaa aatataatta ataaaatgaa 
40321 tgaaaataga gcttataatc atattgatag ttttatcact tcagagtacc gacgaaaaat

40381 aaacgattat aatctttatc ttgataaatt tgaagaacag tttagtcaaa agtttaaaat 

40441 aaacagaact tcgataaaag aaagaattat tattaattta aacaagagga gatttaaatg 

40501 atgtggatta ctatgactat tgtatttgct atattgctat tagtttgtat cagtattaat 

40561 agtgatcgtg caagagagat acaagcactt agatatatga atgattatct acttgatgaa 

40621 gtagttaaaa ctaaagggta caacgggtta gaagaataca ggattgaatt gaagcgaatg 

40681 aataacgata ttaaaaagta atttatatta tcggaggtat tgcattgaat gataaagatt 

40741 gagaaacacg atatcaaaaa gcttgaagaa tacattcagc acatcgataa ctatcgaaga 

40801 gagttgaaga tgcgagaata tgaattactt gaaagtcatg aaccagataa tgcgggagct 

40861 ggcaaaagta atttgccggg taacccgatt gaacgatgtg caataaagaa gtttagtgat 

40921 aacaggtaca atacattaag aaatatagtt aacggtgtag atagattgat aggtgaaagt 

40981 gatgaggata cgcttgagtt attaaggttt agatattggg attgtcctat tggttgttat 

41041 gaatgggaag atatagcaca ttactttggt acaagtaaga caagtatatt acgtagaagg 

41101 aatgcactga tcgataagtt agcaaagtat attggttatg tgtagcggac ttttacccta 

41161 tgtaagtccg cattaaaaca gtttattatg ttagtatcag attaatattt aaagttatta 

41221 aatgctaata cgacgcatga acaagaggcg catcactatg tgatgtgtct ttttatttat 

41281 gaggtatgaa catgttcaaa ctaattgtaa atacattact acacatcaag tatagatgag 

41341 tcttgatact acttaagtta tataaggtga aacattatga tgactaaaga cgaacgtata 

41401 cgattctata agtctaaaga atggcaaata acaagaaaaa gagtgctaga aagagataat 

41461 tatgaatgtc aacaatgtaa gagagacggc aagttaacga catatgacaa aagcaagcgt 

41521 aagtcgttgg atgtagatca tatattatcg ctagaacatc atccggagtt tgctcatgac 

41581 ttaaacaatt tagaaacact gtgtattaaa tgtcacaaca aaaaagaaaa gagatttata 

41641 aaaaaagaaa ataaatggaa agacgaaaaa tggtaaatac ccccgggtca aaaaaatcaa 

41701 aagcgatc 
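
Each row of the listing above begins with the genome position of its first base, followed by the bases in groups of ten. As a rough illustration of how such a listing can be indexed by coordinate, the following sketch parses two rows copied from the listing and extracts a sub-sequence. The helper names `parse_listing` and `subsequence` are hypothetical and are not part of the original disclosure.

```python
def parse_listing(text):
    """Map each 1-based genome position to its base from numbered listing rows."""
    bases = {}
    for row in text.strip().splitlines():
        fields = row.split()
        if not fields or not fields[0].isdigit():
            continue  # skip blank or non-sequence rows
        pos = int(fields[0])
        for base in "".join(fields[1:]):
            if base in "acgt":  # ignore any stray non-base characters
                bases[pos] = base
                pos += 1
    return bases


def subsequence(bases, start, end):
    """Return the bases from start to end, inclusive."""
    return "".join(bases[p] for p in range(start, end + 1))


# Two rows copied from the listing above.
ROWS = """
29281 aacgaaaaac ggaggaagtc aagatgtatt acgaaatagg cgaaatcata cgcaaaaata
29341 ttcatgttaa cggattcgat tttaagctat tcattttaaa aggtcatatg ggcatatcaa
"""

genome = parse_listing(ROWS)
start_codon = subsequence(genome, 29304, 29306)
```

Position 29304 is the start listed for 77ORF043, so `start_codon` holds the first codon of that ORF.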
No.  Name         Position
1    77ORF005     19572..21026
2    77ORF006     3976..5196
3    77ORF007     21871..23076
4    77ORF008     2120..3307
5    77ORF009     31946..32803
6    77ORF010     26092..26889
7    77ORF011     24441..25208
8    77ORF012     29788..30576
9    77ORF013     33620..34399
10   77ORF014     27760..28512
11   77ORF015     3291..4028
12   77ORF016     32867..33610
13   77ORF017     23269..23982
14   77ORF018     31169..31840
15   77ORF019     39851..40501
16   77ORF020     6926..7570
17   77ORF021     37762..38304
18   77ORF022     30605..31156
19   77ORF023     26903..27346
20   77ORF024     10700..11140
21   77ORF025     9707..10147
22   77ORF026     40729..41145
23   77ORF027     6518..6925
24   77ORF028     34795..35199
25   77ORF029     6117..6521
26   77ORF030     36478..36879
27   77ORF031     39151..39546
28   77ORF032     33892..34266
29   77ORF033     5758..6120
30   77ORF034     7886..8236
31   77ORF035     19258..19560
32   77ORF036     36876..37223
33   77ORF037     102..446
34   77ORF038     34908..35219
35   77ORF039     37220..37528
36   77ORF040     41377..41676
37   77ORF041     35454..35753
38   77ORF042     5490..5774
39   77ORF043     29304..29564
40   77ORF044     18481..18768
41   77ORF045     5216..5500
42   77ORF046     25663..25935
43   77ORF047     11159..11425
44   77ORF048     28776..29039
45   77ORF049     36013..36255
46   77ORF050     35753..36007
47   77ORF051     38931..39167
48   [illegible]  [illegible]
49   [illegible]  [illegible]
50   [illegible]  [illegible]
51   77ORF055     [illegible]
52   [illegible]  [illegible]
53   [illegible]  [illegible]
54   77ORF064     29574..29795
55   [illegible]  [illegible]
56   [illegible]  [illegible]
57   [illegible]  [illegible]
58   77ORF070     36269..36475
59   77ORF071     40498..40701
60   77ORF072     [illegible]
61   77ORF073     30945..31148
62   77ORF074     38544..38738
63   77ORF075     13673..13870
64   77ORF077     25357..25605
65   77ORF079     29089..29280
66   77ORF080     35204..35389
67   77ORF085     24060..24242
68   77ORF092     39706..39876
69   77ORF094     32226..32393
70   77ORF096     13606..13773
71   77ORF098     7092..7256
72   77ORF102     29051..29212
73   77ORF104     34393..34551
74   77ORF109     18282..18434
75   77ORF112     39543..39692
76   77ORF117     27361..27501
77   77ORF118     [illegible]
78   77ORF120     36059..36199
79   77ORF124     33699..33833
80   77ORF128     14221..14355
81   77ORF130     15675..15806
82   77ORF133     8414..8542
83   77ORF140     13113..13235
84   77ORF147     7029..7148
85   77ORF149     30668..30787
86   77ORF151     31837..31953
87   77ORF155     30278..30391
88   77ORF157     4044..4157
89   77ORF167     20692..20799
90   77ORF175     35717..35821
91   77ORF176     6836..6940
92   77ORF178     35390..35491
93   77ORF179     8318..8419
94   77ORF182     29268..29564
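
The positions listed for each ORF are inclusive start..end coordinates on the phage 77 genome, so the length in bases and the number of encoded residues follow directly from them. The sketch below illustrates that arithmetic under the assumption that the last codon of each ORF is its stop codon, as is the case for the ORF sequences shown in Table 4. The function names are hypothetical.

```python
def orf_length(position):
    """Length in bases of an ORF given as an inclusive 'start..end' string."""
    start, end = (int(p) for p in position.split(".."))
    return abs(end - start) + 1


def residue_count(position):
    """Number of encoded amino acids, assuming the final codon is a stop codon."""
    length = orf_length(position)
    if length % 3 != 0:
        raise ValueError(f"{position}: length {length} is not a whole number of codons")
    return length // 3 - 1


# Coordinates taken from the table above.
for name, position in [("77ORF017", "23269..23982"),
                       ("77ORF019", "39851..40501"),
                       ("77ORF043", "29304..29564")]:
    print(name, orf_length(position), residue_count(position))
```

The residue counts obtained this way (237, 216 and 86) agree with the "Number of amino acids" values reported for these ORFs in Table 4.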
Table 4

77ORF017 sequence

23982 atgacgcataatatagaaaaacgcattaataaattaaaaacttct 

1 MTHNIEKRINKLKTS 

23937 ggaaatccaaaatttaaaaagttagattcagatattcactattta 

16 GNPKFKKLDSDIHYL

23892 ctcaagagatttgaaggtgaaaaaaaccataaaggtttttatcca 

31 LKRFEGEKNHKGFYP

23847 aagtttaaacaaggagaaatagtttttgtagatttcggtataaac 

46 KFKQGEIVFVDFGIN 

23802 gttaataaagaattttctaattcacactttgcaatagtgatgaat 

61 VNKEFSNSHFAIVMN 

23757 aaaaatgattctaatacggaggatatagtaaatgttattccctta 

76 KNDSNTEDIVNVIPL

23712 tcctctaaagaaaacaaaaagtatttaaagatgaattttgatttg 

91 SSKENKKYLKMNFDL 

23667 aaatgggagtattatttaagattgtttttaaatttaattagcgcg 

106 KWEYYLRLFLNLISA 

23622 caaaataattcagctatattaaaagaagttttcgataaaaaatac 

121 QNNSAILKEVFDKKY

23577 caaaaaaacaacacagaattcatcactaaagattattttattgaa 

136 QKNNTEFITKDYFIE 

23532 tttatatctgatagtttagaaattgaaaataaattaaataaaatt 

151 FISDSLEIENKLNKI 

23487 gacagaaacattaataacatagtatcagcaattgataaggtaaaa

166 DRNINNIVSAIDKVK

23442 aaattaaaaggtaatagttacgcttgcataaattctttccagccg 

181 KLKGNSYACINSFQP

23397 attagtaagtttcgcataagaaaagttttaccccaaaaaattaaa

196 ISKFRIRKVLPQKIK 

23352 aatccagtaatagattcttcggatattatgttactgataaataga

211 NPVIDSSDIMLLINR 

23307 attaataataatatattgcagatccctgatataagatga 23269 

226 INNNILQIPDIR* 
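
Each pair of rows above gives 45 bases followed by the 15 residues they encode, with `*` marking the stop codon. As a rough illustration of that correspondence, the sketch below translates a DNA string with the standard genetic code. The standard code is an assumption here (the rows above are consistent with it), and the function name `translate` is illustrative only.

```python
BASES = "tcag"
# Standard genetic code, codons ordered with bases t, c, a, g at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First and last nucleotide rows of the 77ORF017 sequence above.
first_row = "atgacgcataatatagaaaaacgcattaataaattaaaaacttct"
last_row = "attaataataatatattgcagatccctgatataagatga"

print(translate(first_row))
print(translate(last_row))
```

Translating the first and last rows reproduces the residues printed beneath them, including the terminal stop.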
Physico-chemical parameters of ORF 77ORF017

1 MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN
61 VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA
121 QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK
181 KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR

Number of amino acids: 237
Average molecular weight (Daltons): 27887.38
Mean amino acid weight (Daltons): 117.67
Monoisotopic molecular weight (Daltons): 27869.83
Mean amino acid monoisotopic weight (Daltons): 117.59

Amino acid composition

Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       5       2.11%   7.58%
Cys   C       1       0.42%   1.66%
Asp   D       14      5.91%   5.28%
Glu   E       13      5.49%   6.37%
Phe   F       16      6.75%   4.09%
Gly   G       6       2.53%   6.84%
His   H       4       1.69%   2.24%
Ile   I       29      12.24%  5.81%
Lys   K       33      13.92%  5.95%
Leu   L       19      8.02%   9.42%
Met   M       4       1.69%   2.37%
Asn   N       30      12.66%  4.45%
Pro   P       7       2.95%   4.9%
Gln   Q       6       2.53%   3.97%
Arg   R       8       3.38%   5.16%
Ser   S       17      7.17%   7.12%
Thr   T       5       2.11%   5.67%
Val   V       11      4.64%   6.58%
Trp   W       1       0.42%   1.23%
Tyr   Y       8       3.38%   3.18%

Number of acidic (negative) amino acids (ED):  27   11.39%
Number of basic (positive) amino acids (KR):   41   17.30%
Total charge (KRED):                           68   28.69%
Net charge (KR - ED):                          14   5.91%
Theoretical pI:                                10.01
Total linear charge density:                   0.30
Average hydrophobicity:                        -5.37
Ratio of hydrophilicity to hydrophobicity:     1.41
Percentage of hydrophilic amino acid:          57.81%
Percentage of hydrophobic amino acid:          42.19%
Ratio of %hydrophilic to %hydrophobic:         1.37
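
The charge figures reported for 77ORF017 can be re-derived from the protein sequence by counting residues: acidic residues are Glu (E) and Asp (D), basic residues are Lys (K) and Arg (R), and the net charge is the difference between the two counts. The following sketch illustrates this counting only. It does not attempt the isoelectric point or hydrophobicity values, whose calculation method is not stated in the source, and the function name `charge_summary` is hypothetical.

```python
from collections import Counter

# 77ORF017 protein sequence as given above (237 residues).
ORF017 = (
    "MTHNIEKRINKLKTSGNPKFKKLDSDIHYLLKRFEGEKNHKGFYPKFKQGEIVFVDFGIN"
    "VNKEFSNSHFAIVMNKNDSNTEDIVNVIPLSSKENKKYLKMNFDLKWEYYLRLFLNLISA"
    "QNNSAILKEVFDKKYQKNNTEFITKDYFIEFISDSLEIENKLNKIDRNINNIVSAIDKVK"
    "KLKGNSYACINSFQPISKFRIRKVLPQKIKNPVIDSSDIMLLINRINNNILQIPDIR"
)


def charge_summary(protein):
    """Count acidic (E, D) and basic (K, R) residues and derive charge totals."""
    counts = Counter(protein)
    acidic = counts["E"] + counts["D"]
    basic = counts["K"] + counts["R"]
    n = len(protein)
    return {
        "length": n,
        "acidic": acidic,
        "basic": basic,
        "total": acidic + basic,
        "net": basic - acidic,
        "acidic_pct": round(100 * acidic / n, 2),
        "basic_pct": round(100 * basic / n, 2),
    }


summary = charge_summary(ORF017)
print(summary)
```

For this sequence the counts come out as 27 acidic and 41 basic residues, giving a total of 68 and a net charge of 14, in agreement with the values listed above.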
77ORF019 sequence

39851 atgaacgagcaaataataggaagcatatatactttagcaggaggt 

1 MNEQIIGSIYTLAGG 

39896 gttgtgctttattcagttaaagagatttttaggtattttacagat 

16 VVLYSVKEIFRYFTD 

39941 tctaacttacaacgtaaaaaaatcaatttagaacaaatatatccg 

31 SNLQRKKINLEQIYP

39986 atatatttagattgttttaaaaaggctaaaaagatgattggagct 

46 IYLDCFKKAKKMIGA 

40031 tatattattccaacagaacagcatgaatttttagatttttttgat 

61 YIIPTEQHEFLDFFD

40076 attgaagtctttaataatttagataagcaaagtaaaaaagcgtat 

76 IEVFNNLDKQSKKAY 

40121 gaaaatgttattggatttagacaaatgattaatttatcaaataga 

91 ENVIGFRQMINLSNR 

40166 gttaaggcaatggaagattttaagatgagtttcaacaatgaattt 

106 VKAMEDFKMSFNNEF

40211 agtacaaatcagattttttttaatccttcttttgttatggaaaca 

121 STNQIFFNPSFVMET

40256 attgctattataaatgaatatcaaaaagatatatcttatttaaaa 

136 IAIINEYQKDISYLK

40301 aatataattaataaaatgaatgaaaatagagcttataatcatatt 

151 NIINKMNENRAYNHI

40346 gatagttttatcacttcagagtaccgacgaaaaataaacgattat 

166 DSFITSEYRRKINDY 

40391 aatctttatcttgataaatttgaagaacagtttagtcaaaagttt 

181 NLYLDKFEEQFSQKF 

40436 aaaataaacagaacttcgataaaagaaagaattattattaattta 

196 KINRTSIKERIIINL

40481 aacaagaggagatttaaatga 40501 

211 NKRRFK*
Physico-chemical parameters of ORF 77ORF019

1 MNEQIIGSIY TLAGGVVLYS VKEIFRYFTD SNLQRKKINL EQIYPIYLDC FKKAKKMIGA
61 YIIPTEQHEF LDFFDIEVFN NLDKQSKKAY ENVIGFRQMI NLSNRVKAME DFKMSFNNEF
121 STNQIFFNPS FVMETIAIIN EYQKDISYLK NIINKMNENR AYNHIDSFIT SEYRRKINDY
181 NLYLDKFEEQ FSQKFKINRT SIKERIIINL NKRRFK

Number of amino acids: 216
Average molecular weight (Daltons): 26026.06
Mean amino acid weight (Daltons): 120.49
Monoisotopic molecular weight (Daltons): 26009.34
Mean amino acid monoisotopic weight (Daltons): 120.41

Amino acid composition

Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       7       3.24%   7.58%
Cys   C       1       0.46%   1.66%
Asp   D       10      4.63%   5.28%
Glu   E       16      7.41%   6.37%
Phe   F       19      8.80%   4.09%
Gly   G       5       2.31%   6.84%
His   H       2       0.93%   2.24%
Ile   I       28      12.96%  5.81%
Lys   K       22      10.19%  5.95%
Leu   L       12      5.56%   9.42%
Met   M       7       3.24%   2.37%
Asn   N       23      10.65%  4.45%
Pro   P       3       1.39%   4.9%
Gln   Q       10      4.63%   3.97%
Arg   R       11      5.09%   5.16%
Ser   S       13      6.02%   7.12%
Thr   T       7       3.24%   5.67%
Val   V       7       3.24%   6.58%
Trp   W       0       0.00%   1.23%
Tyr   Y       13      6.02%   3.18%

Number of acidic (negative) amino acids (ED):  26   12.04%
Number of basic (positive) amino acids (KR):   33   15.28%
Total charge (KRED):                           59   27.31%
Net charge (KR - ED):                          7    3.24%
Theoretical pI:                                9.52
Total linear charge density:                   0.28
Average hydrophobicity:                        -4.84
Ratio of hydrophilicity to hydrophobicity:     1.37
Percentage of hydrophilic amino acid:          54.17%
Percentage of hydrophobic amino acid:          45.83%
Ratio of %hydrophilic to %hydrophobic:         1.18
77ORF043 sequence

29304 atgtattacgaaataggcgaaatcatacgcaaaaatattcatgtt

1 MYYEIGEIIRKNIHV

29349 aacggattcgattttaagctattcattttaaaaggtcatatgggc

16 NGFDFKLFILKGHMG

29394 atatcaatacaagttaaagatatgaacaacgtaccaattaaacat

31 ISIQVKDMNNVPIKH

29439 gcttatgtcgtagatgagaatgacttagatatggcatcagactta

46 AYVVDENDLDMASDL

29484 tttaaccaagcaatagatgaatggattgaagagaacacagacgaa

61 FNQAIDEWIEENTDE

29529 caggacagactaattaacttagtcatgaaatggtag 29564

76 QDRLINLVMKW*
Physico-chemical parameters of ORF 77ORF043

1 MYYEIGEIIR KNIHVNGFDF KLFILKGHMG ISIQVKDMNN VPIKHAYVVD ENDLDMASDL
61 FNQAIDEWIE ENTDEQDRLI NLVMKW

Number of amino acids: 86
Average molecular weight (Daltons): 10186.68
Mean amino acid weight (Daltons): 118.45
Monoisotopic molecular weight (Daltons): 10180.02
Mean amino acid monoisotopic weight (Daltons): 118.37

Amino acid composition

Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       3       3.49%   7.58%
Cys   C       0       0.00%   1.66%
Asp   D       9       10.47%  5.28%
Glu   E       7       8.14%   6.37%
Phe   F       4       4.65%   4.09%
Gly   G       4       4.65%   6.84%
His   H       3       3.49%   2.24%
Ile   I       11      12.79%  5.81%
Lys   K       6       6.98%   5.95%
Leu   L       6       6.98%   9.42%
Met   M       5       5.81%   2.37%
Asn   N       8       9.30%   4.45%
Pro   P       1       1.16%   4.9%
Gln   Q       3       3.49%   3.97%
Arg   R       2       2.33%   5.16%
Ser   S       2       2.33%   7.12%
Thr   T       1       1.16%   5.67%
Val   V       6       6.98%   6.58%
Trp   W       2       2.33%   1.23%
Tyr   Y       3       3.49%   3.18%

Number of acidic (negative) amino acids (ED):  16   18.60%
Number of basic (positive) amino acids (KR):   8    9.30%
Total charge (KRED):                           24   27.91%
Net charge (KR - ED):                          -8   9.30%
Theoretical pI:                                4.38
Total linear charge density:                   0.30
Average hydrophobicity:                        -2.80
Ratio of hydrophilicity to hydrophobicity:     1.19
Percentage of hydrophilic amino acid:          48.84%
Percentage of hydrophobic amino acid:          51.16%
Ratio of %hydrophilic to %hydrophobic:         0.95
77ORF102 sequence



29051 atgagcaacatttataaaagctacctagtagcagtattatgcttc

1 MSNIYKSYLVAVLCF

29096 acagtcttagcgattgtacttatgccgtttctatacttcactaca

16 TVLAIVLMPFLYFTT

29141 gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac

31 AWSIAGFASIATFMY

29186 tacaaagaatgctttttcaaagaataa 29212

46 YKECFFKE*
Physico-chemical parameters of ORF 77ORF102

1 MSNIYKSYLV AVLCFTVLAI VLMPFLYFTT AWSIAGFASI ATFMYYKECF FKE

Number of amino acids: 53
Average molecular weight (Daltons): 6155.42
Mean amino acid weight (Daltons): 116.14
Monoisotopic molecular weight (Daltons): 6151.07
Mean amino acid monoisotopic weight (Daltons): 116.06

Amino acid composition

Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       6       11.32%  7.58%
Cys   C       2       3.77%   1.66%
Asp   D       0       0.00%   5.28%
Glu   E       2       3.77%   6.37%
Phe   F       7       13.21%  4.09%
Gly   G       1       1.89%   6.84%
His   H       0       0.00%   2.24%
Ile   I       4       7.55%   5.81%
Lys   K       3       5.66%   5.95%
Leu   L       5       9.43%   9.42%
Met   M       3       5.66%   2.37%
Asn   N       1       1.89%   4.45%
Pro   P       1       1.89%   4.9%
Gln   Q       0       0.00%   3.97%
Arg   R       0       0.00%   5.16%
Ser   S       4       7.55%   7.12%
Thr   T       4       7.55%   5.67%
Val   V       4       7.55%   6.58%
Trp   W       1       1.89%   1.23%
Tyr   Y       5       9.43%   3.18%

Number of acidic (negative) amino acids (ED):  2    3.77%
Number of basic (positive) amino acids (KR):   3    5.66%
Total charge (KRED):                           5    9.43%
Net charge (KR - ED):                          1    1.89%
Theoretical pI:                                8.18
Total linear charge density:                   0.13
Average hydrophobicity:                        10.8[illegible]
Ratio of hydrophilicity to hydrophobicity:     0.40
Percentage of hydrophilic amino acid:          28.30%
Percentage of hydrophobic amino acid:          71.70%
Ratio of %hydrophilic to %hydrophobic:         0.39
77ORF104 sequence



34393 atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat

1 MVTKEFLKTKLECSD

34438 atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat

16 MYAQKLIDEAQGDEN

34483 aggttgtacgacctatttatccaaaaacttgcagaacgtcataca

31 RLYDLFIQKLAERHT

34528 cgccccgctatcgtcgaatattaa 34551

46 RPAIVEY*
Physico-chemical parameters of ORF 77ORF104

1 MVTKEFLKTK LECSDMYAQK LIDEAQGDEN RLYDLFIQKL AERHTRPAIV EY

Number of amino acids: 52
Average molecular weight (Daltons): 6193.13
Mean amino acid weight (Daltons): 119.10
Monoisotopic molecular weight (Daltons): 6189.12
Mean amino acid monoisotopic weight (Daltons): 119.02

Amino acid composition

Acid  Symbol  Number  %       Average % in Swissprot
Ala   A       4       7.69%   7.58%
Cys   C       1       1.92%   1.66%
Asp   D       4       7.69%   5.28%
Glu   E       6       11.54%  6.37%
Phe   F       2       3.85%   4.09%
Gly   G       1       1.92%   6.84%
His   H       1       1.92%   2.24%
Ile   I       3       5.77%   5.81%
Lys   K       5       9.62%   5.95%
Leu   L       6       11.54%  9.42%
Met   M       2       3.85%   2.37%
Asn   N       1       1.92%   4.45%
Pro   P       1       1.92%   4.9%
Gln   Q       3       5.77%   3.97%
Arg   R       3       5.77%   5.16%
Ser   S       1       1.92%   7.12%
Thr   T       3       5.77%   5.67%
Val   V       2       3.85%   6.58%
Trp   W       0       0.00%   1.23%
Tyr   Y       3       5.77%   3.18%

Number of acidic (negative) amino acids (ED):  10   19.23%
Number of basic (positive) amino acids (KR):   8    15.38%
Total charge (KRED):                           18   34.62%
Net charge (KR - ED):                          -2   3.85%
Theoretical pI:                                5.03
Total linear charge density:                   0.38
Average hydrophobicity:                        -5.81
Ratio of hydrophilicity to hydrophobicity:     1.47
Percentage of hydrophilic amino acid:          53.85%
Percentage of hydrophobic amino acid:          46.15%
Ratio of %hydrophilic to %hydrophobic:         1.17
77ORF182 sequence

29268 atgttcaatataaaacgaaaaacggaggaagtcaagatgtattac 

1 MFNIKRKTEEVKMYY

29313 gaaataggcgaaatcatacgcaaaaatattcatgttaacggattc 

16 EIGEIIRKNIHVNGF 

29358 gattttaagctattcattttaaaaggtcatatgggcatatcaata 

31 DFKLFILKGHMGISI 

29403 caagttaaagatatgaacaacgtaccaattaaacatgcttatgtc 

46 QVKDMNNVPIKHAYV

29448 gtagatgagaatgacttagatatggcatcagacttatttaaccaa 

61 VDENDLDMASDLFNQ 

29493 gcaatagatgaatggattgaagagaacacagacgaacaggacaga 

76 AIDEWIEENTDEQDR

29538 ctaattaacttagtcatgaaatggtag 29564 

91 LINLVMKW* 
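
Comparing coordinates, 77ORF182 (29268..29564) and 77ORF043 (29304..29564) end at the same base, and the 77ORF182 translation above contains the 77ORF043 residues after its first twelve. The sketch below shows one way to test whether one ORF is an in-frame extension of another from coordinates alone. It is an illustration rather than a method described in the source, and the function names are hypothetical.

```python
def is_in_frame_extension(longer, shorter):
    """True if both ORFs share an end and their starts are whole codons apart.

    Each ORF is an inclusive (start, end) pair on the same strand.
    """
    (start_long, end_long), (start_short, end_short) = longer, shorter
    return (end_long == end_short
            and start_long < start_short
            and (start_short - start_long) % 3 == 0)


def extra_codons(longer, shorter):
    """Number of additional codons at the start of the longer ORF."""
    return (shorter[0] - longer[0]) // 3


ORF182 = (29268, 29564)
ORF043 = (29304, 29564)

print(is_in_frame_extension(ORF182, ORF043), extra_codons(ORF182, ORF043))
```

The twelve extra codons correspond to the difference between the 98 residues of 77ORF182 and the 86 residues of 77ORF043.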
Physico-chemical parameters of ORF 77ORF182

1 MFNIKRKTEE VKMYYEIGEI IRKNIHVNGF DFKLFILKGH MGISIQVKDM NNVPIKHAYV 

61 VDENDLDMAS DLFNQAIDEW IEENTDEQDR LINLVMKW 



Number of amino acids : 98 

Average molecular weight (Daltons): 1 1691.50 

Mean amino acid weight (Daltons): 1 1 9.30 

Monoisotopic molecular weight (Daltons): 1 1683.84 

Mean amino acid monoisotopic weight (Daltons): 1 19.22 



Amino acid composition

Acid     Symbol   Number    %         Average % in Swissprot

Ala      A         3         3.06%    7.58%
Cys      C         0         0.00%    1.66%
Asp      D         9         9.18%    5.28%
Glu      E         9         9.18%    6.37%
Phe      F         5         5.10%    4.09%
Gly      G         4         4.08%    6.84%
His      H         3         3.06%    2.24%
Ile      I        12        12.24%    5.81%
Lys      K         9         9.18%    5.95%
Leu      L         6         6.12%    9.42%
Met      M         6         6.12%    2.37%
Asn      N         9         9.18%    4.45%
Pro      P         1         1.02%    4.9%
Gln      Q         3         3.06%    3.97%
Arg      R         3         3.06%    5.16%
Ser      S         2         2.04%    7.12%
Thr      T         2         2.04%    5.67%
Val      V         7         7.14%    6.58%
Trp      W         2         2.04%    1.23%
Tyr      Y         3         3.06%    3.18%

Number of acidic (negative) amino acids (ED):    18    18.37%
Number of basic (positive) amino acids (KR):     12    12.24%
Total charge (KRED):                             30    30.61%
Net charge (KR-ED):                              -6     6.12%

Theoretical pI:                                4.76
Total linear charge density:                   0.33
Average hydrophobicity:                       -3.89
Ratio of hydrophilicity to hydrophobicity:     1.28
Percentage of hydrophilic amino acid:         51.02%
Percentage of hydrophobic amino acid:         48.98%
Ratio of %hydrophilic to %hydrophobic:         1.04
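
The residue counts and charge totals listed for 77ORF182 follow directly from its amino acid sequence. The sketch below is illustrative only: the function names `composition` and `charge_summary` are assumptions introduced here. It covers just the counting-based parameters; the isoelectric point and hydrophobicity figures depend on scales that the specification does not state, so they are not reproduced.

```python
from collections import Counter

# 77ORF182 protein sequence (98 residues), as listed in the parameter table.
PROTEIN = (
    "MFNIKRKTEE" "VKMYYEIGEI" "IRKNIHVNGF" "DFKLFILKGH" "MGISIQVKDM"
    "NNVPIKHAYV" "VDENDLDMAS" "DLFNQAIDEW" "IEENTDEQDR" "LINLVMKW"
)


def composition(seq: str) -> dict:
    """Return {residue: (count, percent of sequence)} for each residue present."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (counts[aa], round(100 * counts[aa] / n, 2)) for aa in sorted(counts)}


def charge_summary(seq: str) -> dict:
    """Count acidic (E, D) and basic (K, R) residues and derive total and net charge."""
    acidic = sum(seq.count(aa) for aa in "ED")
    basic = sum(seq.count(aa) for aa in "KR")
    return {"acidic": acidic, "basic": basic,
            "total": acidic + basic, "net": basic - acidic}


comp = composition(PROTEIN)
summary = charge_summary(PROTEIN)
for aa, (count, pct) in comp.items():
    print(f"{aa}  {count:3d}  {pct:6.2f}%")
print(summary)
```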
Table 5

BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100017|lan|77ORF017 Phage 77 ORF|23269-23982|-3
         (237 letters)

Database: nr
           393,678 sequences; 120,452,765 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ...      41    0.010
gi|730607|sp|P23250|RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR P...      38    0.053
gi|3097044|emb|CAA75299| (Y15035) KIR [Cowpox virus]                    38    0.090
gi|2146245|pir||S73794 hypothetical protein H91_orf180 - Mycopl...      38    0.090
gi|83910|pir||S04682 ribosomal protein var1 - yeast (Candida gl...      37    0.15
gi|133135|sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN ...      37    0.15
gi|2128843|pir||H64475 hypothetical protein MJ1409 - Methanococ...      36    0.20
gi|5107017|gb|AAD39926.1|AF126285_2 (AF126285) RNA polymerase [...      36    0.35
gi|2146210|pir||S73342 hypothetical protein E07_orf166 - Mycopl...      35    0.60

Database: swissprot
           79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P23250|RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR PROTEIN.            38    0.014
sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.              37    0.040
sp|Q21444|LDLC_CAEEL LDLC PROTEIN HOMOLOG.                              34    0.35
sp|P27240|RFAY_ECOLI LIPOPOLYSACCHARIDE CORE BIOSYNTHESIS PROT...       33    0.46
sp|P53192|YGC0_YEAST HYPOTHETICAL 27.1 KD PROTEIN IN ALK1-CKB1 ...      33    0.60
sp|P32908|SMC1_YEAST CHROMOSOME SEGREGATION PROTEIN SMC1 (DA-B...       33    0.60
sp|P54683|TAGB_DICDI PRESTALK-SPECIFIC PROTEIN TAGB PRECURSOR ...       32    0.78
sp|Q03100|CYAA_DICDI ADENYLATE CYCLASE, AGGREGATION SPECIFIC (...       32    0.78
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100019|lan|77ORF019 Phage 77 ORF|39851-40501|2
         (216 letters)

Database: nr
           373,355 sequences; 114,214,446 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL]     437    e-122
gi|2689911 (AE000792) B. burgdorferi predicted coding region BB...      38    0.058
gi|1171589|emb|CAA64574| (X95275) frameshift [Plasmodium falcip...      37    0.10
gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ...      36    0.23
gi|141257|sp|P18019|YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (OR...      36    0.29
gi|133412|sp|P27059|RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA...      35    0.51
gi|3122231|sp|Q58851|HISX_METJA HISTIDINOL DEHYDROGENASE (HDH)          35    0.51
gi|3649757|emb|CAB11106.1| (Z98547) predicted using hexExon; MA...      34    0.66
gi|2688313 (AE001146) sensory transduction histidine kinase, pu...      34    0.87

Database: swissprot
           79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P18019|YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (ORF9).               36    0.079
sp|Q58851|HISX_METJA HISTIDINOL DEHYDROGENASE (EC 1.1.1.23) (H...       35    0.14
sp|P27059|RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA CHAIN (E...       35    0.14
sp|Q02224|CENE_HUMAN CENTROMERIC PROTEIN E (CENP-E PROTEIN).            34    0.31
sp|P04931|ARP_PLAFA ASPARAGINE-RICH PROTEIN (AG319) (ARP) (FRA...       33    0.53
sp|P18011|IPAB_SHIFL 62 KD MEMBRANE ANTIGEN.                            32    0.69
sp|P18709|VTA2_XENLA VITELLOGENIN A2 PRECURSOR (VTG A2) [CONTA...       32    0.90
sp|Q64409|CP3H_CAVPO CYTOCHROME P450 3A17 (EC 1.14.14.1) (CYPI...       32    0.90
sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.              32    0.90
sp|Q03945|IPAB_SHIDY 62 KD MEMBRANE ANTIGEN.                            32    1.2
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100043|lan|77ORF043 Phage 77 ORF|29304-29564|3
         (86 letters)

Database: nr
           373,355 sequences; 114,214,446 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|3341947|dbj|BAA31913| (AB009866) orf 39 [bacteriophage phi PVL]     182    6e-46
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo...      32    0.84
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN...      32    0.84
gi|1169735|sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTE...      32    0.84
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa...      32    0.84
gi|3875402|emb|CAA98122| (Z73906) cDNA EST EMBL:D64544 comes fr...      31    2.5
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa...      30    4.2

Database: swissprot
           79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP).          32    0.24
sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R...       32    0.24
sp|P34554|YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C...       28    3.5
sp|Q24118|LIO_DROME LINOTTE PROTEIN.                                    28    3.5
sp|P80034|ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II).                     28    3.5
sp|P22922|A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT).                        28    3.5
sp|Q44363|TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA.                    28    3.5
sp|P38255|YBU5_YEAST HYPOTHETICAL 51.3 KD PROTEIN IN PH05-VPS1...       27    6.0
sp|P55822|SH3B_HUMAN SH3BGR PROTEIN (21-GLUTAMIC ACID-RICH PRO...       27    7.9
sp|Q58482|YA82_METJA HYPOTHETICAL PROTEIN MJ1082.                       27    7.9
sp|P34252|YKK8_YEAST HYPOTHETICAL 52.3 KD PROTEIN IN HAP4-AAT1...       27    7.9
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100102|lan|77ORF102 Phage 77 ORF|29051-29212|2
         (53 letters)

Database: nr
           373,355 sequences; 114,214,446 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|3341946|dbj|BAA31912| (AB009866) orf 38 [bacteriophage phi PVL]      96    3e-20
gi|4325288|gb|AAD17315| (AF123593) voltage-dependent sodium cha...      28    7.1
gi|2649684 (AE001040) A. fulgidus predicted coding region AF092...      28    9.3

Database: swissprot
           79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P42087|HUTM_BACSU PUTATIVE HISTIDINE PERMEASE.                       26    7.1
sp|P04775|CIN2_RAT SODIUM CHANNEL PROTEIN, BRAIN II ALPHA SUBU...       26    9.2
sp|P42619|YQJF_ECOLI HYPOTHETICAL 17.2 KD PROTEIN IN EXUR-TDCC...       26    9.2
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100104|lan|77ORF104 Phage 77 ORF|34393-34551|1
         (52 letters)

Database: nr
           373,355 sequences; 114,214,446 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|2315523 (AF016452) similar to the leucine-rich domains found...      29    4.2
gi|4377168|gb|AAD18990| (AE001666) CT711 hypothetical protein [...      29    5.4
gi|3882171|dbj|BAA34445| (AB018268) KIAA0725 protein [Homo sapi...      28    9.3

Database: swissprot
           79,449 sequences; 28,874,452 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P04879|RRPP_VSVIG RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48...       27    5.4
sp|P04880|RRPP_VSVIM RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48...       27    5.4
sp|Q13946|CN7A_HUMAN HIGH-AFFINITY CAMP-SPECIFIC 3',5'-CYCLIC ...       26    7.1
sp|P35381|ATPA_DROME ATP SYNTHASE ALPHA CHAIN, MITOCHONDRIAL P...       26    9.3
sp|P54659|MVPB_DICDI MAJOR VAULT PROTEIN BETA (MVP-BETA).               26    9.3
sp|P40397|YHXC_BACSU HYPOTHETICAL OXIDOREDUCTASE IN APRE-COMK ...       26    9.3



BLASTP 2.0.8 [Jan-05-1999]

Query= sid|122748|lan|77ORF182 Phage 77 ORF|29268-29564|3
         (98 letters)

Database: nr
           393,678 sequences; 120,452,765 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

gi|3341947|dbj|BAA31913.1| (AB009866) orf 39 [bacteriophage phi...     182    8e-46
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa...      35    0.13
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN...      32    1.1
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo...      32    1.1
gi|5051381|emb|CAB44736.1| (AL049653) dJ647M16.2 (FK506 binding...      32    1.1
gi|4826730|ref|NP_004949.1|pFRAP1| FK506 binding protein 12-rap...      32    1.1
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa...      32    1.1

Database: swissprot
           79,909 sequences; 29,054,478 total letters

                                                                      Score    E
Sequences producing significant alignments:                           (bits)  Value

sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP).          32    0.29
sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R...       32    0.29
sp|P40557|YIA5_YEAST PUTATIVE DISULFIDE ISOMERASE YIL005W PREC...       29    3.3
sp|Q24118|LIO_DROME LINOTTE PROTEIN.                                    28    4.4
sp|Q44363|TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA.                    28    4.4
sp|P80034|ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II).                     28    4.4
sp|P34554|YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C...       28    4.4
sp|P22922|A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT).                        28    4.4
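
When reading hit lists such as those in Table 5, it can help to separate strong matches from chance-level ones by E value. The sketch below is illustrative: the cutoff of 1e-5 is an assumption chosen for the example and is not a threshold stated in the specification, and `parse_evalue` and `significant` are hypothetical helper names. The three hits are taken from the nr results for 77ORF182.

```python
def parse_evalue(text: str) -> float:
    """Convert a BLAST E-value string to a float.

    BLAST reports abbreviate 1e-122 as "e-122", which float() rejects,
    so a leading "1" is restored in that case.
    """
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


# (description, bit score, E value as printed) for three nr hits of 77ORF182.
HITS = [
    ("gi|3341947 orf 39 [bacteriophage phi PVL]", 182, "8e-46"),
    ("gi|1084792 hypothetical protein YPR070w - yeast", 35, "0.13"),
    ("gi|1169736 FRAP_RAT FKBP-rapamycin associated protein", 32, "1.1"),
]


def significant(hits, cutoff=1e-5):
    """Keep hits whose E value is at or below the cutoff (cutoff is an assumed value)."""
    return [hit for hit in hits if parse_evalue(hit[2]) <= cutoff]


for description, bits, evalue in significant(HITS):
    print(f"{description}\t{bits}\t{evalue}")
```

With this assumed cutoff, only the bacteriophage phi PVL orf 39 hit is retained; the remaining hits have E values near or above 0.1.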



WO 00/32825 



PCT/IB99/02040 



Table 6

1st position                 2nd position                 3rd position
(5' end)          U        C        A        G            (3' end)

                  Phe      Ser      Tyr      Cys          U
   U              Phe      Ser      Tyr      Cys          C
                  Leu      Ser      Stop     Stop         A
                  Leu      Ser      Stop     Trp          G

                  Leu      Pro      His      Arg          U
   C              Leu      Pro      His      Arg          C
                  Leu      Pro      Gln      Arg          A
                  Leu      Pro      Gln      Arg          G

                  Ile      Thr      Asn      Ser          U
   A              Ile      Thr      Asn      Ser          C
                  Ile      Thr      Lys      Arg          A
                  Met      Thr      Lys      Arg          G

                  Val      Ala      Asp      Gly          U
   G              Val      Ala      Asp      Gly          C
                  Val      Ala      Glu      Gly          A
                  Val      Ala      Glu      Gly          G
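
The degeneracy visible in Table 6 (several codons encoding the same amino acid) can be tabulated programmatically. This is a minimal sketch assuming the standard genetic code as laid out in the table; `CODONS` and `degeneracy` are illustrative names, and one-letter amino acid symbols are used with '*' for Stop.

```python
from collections import Counter
from itertools import product

# Codons enumerated in the same U, C, A, G order as Table 6
# (first position slowest, third position fastest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons assigned to each amino acid (or to Stop).
degeneracy = Counter(CODONS.values())

for aa, n in sorted(degeneracy.items(), key=lambda item: (-item[1], item[0])):
    print(f"{aa}: {n} codon(s)")
```

Leu, Ser and Arg each have six codons, whereas Met and Trp have one each, which is why a given amino acid sequence can be encoded by many different nucleotide sequences.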
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Table 7 

Bacteriophage 3A, complete genome sequence 

1 caaacgctag caacgcggat aaatttttca tgaaaggggg tctttatatg aagttaacaa aaaaacagct 

71 aaaagaatat atagaagatt acaaaaaatc tgatgacata ttaattaatt tgtatataga aacatatgaa 

141 ttttattgtc ggttaagaga tgaacttaaa aatagtgatt taatgataga gcatacaaac aaggctggtg 

211 cgagcaatat tattaagaat ccattaagca tagaactgac aaaaacagtt caaacactaa ataacttact 

281 caagtctatg ggtttaactg cagcacaaag aaaaaagata gttcaagaag aaggtggatt cggtgactat 

351 taaagtttta aatgaacctt caccaaaact attaacaaca tggtatgcag agcaagtcac tcaagggaaa 

421 ataaaaacaa gcaaatatgt tagaaaagaa tgtgagagac atcttagata tctagaaaat ggaggtaaat 

491 gggtatttga tgaagaatta gcgcatcgtc ctattcgatt tatagaaaag ttttgtaaac cttccaaagg 

561 atctaaacgt caacttgtat tacagccatg gcaacatttt attatcggca gtttgtttgg ttgggttcat 

631 aaagaaacaa aactgcgcag gtttaaagaa gctttgatat ttatggggcg aaaaaatggt aaaacaacca 

701 ctatttctgg ggttgctaac tatgctgtat cacaagatgg agaaaatggt gcagaaattc atttgttagc 

771 aaacgtaatg aaacaagcta ggattctatt tgatgaatct aaggcgatga ttaaagctag cccaaagctt 

841 gataaaaatt tcagaacatt aagagatgaa atccattatg acgcaacgat atcaaaaatt atgccccaag 

911 catcagatag cgataagtta gatggattga atacacacat ggggattttt gatgaaattc atgaatttaa 

981 agactataaa ttgatttcag ttataaaaaa ctcaagagct gcaaggttac aacctcttct catctacatt 

1051 acgacagcag ggtatcaatt agatggtcca cttgttgata tggtagaagc gggaagagac accttagatc 

1121 aaatcataga agacgaaaga actttttatt atttagcatc tttggatgat gacgatgata ttaatgattc 

1191 gtcgaactgg ataaaagcaa atcccaactt aggtgtctct ataaatttag atgagatgaa agaagagtgg 

1261 gaaaaagcta agagaacacc agctgaacgt ggagatttta taaccaaaag gtttaatatc tttgctaata 

1331 atgacgagat gagttttatt gattacccaa cactccaaaa aaataatgaa attgtttctt tagaagagct 

1401 ggaaggcaga ccgtgcacga ttggttatga tttatcagaa acagaggact ttacagccgc gtgtgctact 

1471 tttgcgttag ataatggtaa agttgcagtt ttatcgcatt catggattcc taagcacaaa gttgaatatt 

1541 ctaacgaaaa aataccctat agagaatggg aagaagatgg cttattaaca gtgcaagata agccttatat 

1611 tgactaccaa gatgttttaa attggataat taagatgaat gagcattatg tagtagaaaa aattacttat 

1681 gatagagcga acgcattcaa actaaatcaa gagttaaaaa attacgggtt tgaaacggaa gaaacaagac 

1751 aaggagcttt gaccttgagc cctgcattga aggatttaaa agaaatgttt ttagatggga aaataatatt 

1821 taataataat cctttaatga aatggtatat caataatgtt cagttgaaac tagacagaaa cggaaactgg 

1891 ttgccgtcta agcaaagcag atatcgtaaa atagatggct ttgcagcatt tttaaacaca tatacagata 

1961 ttatgaataa agttgtttct gatagtggtg aaggaaacat agagtttatt agtattaaag acataatgcg 

2031 ttaaggaggt gaatgttatc gcaaaagaga atattgtcac acgcataaag aaaaaattga tagacaattg 

2101 gattgatcag tcaacttcta agctttatga ctttagccca tggaaaaata gatctttttg gggtgtaatt 

2171 aataatacgc ttgaaactaa tgaaacgata ttttcagcta ttacaaagtt atctaattcg atggctagtt 

2241 tgcccttgaa aatgtatgaa gattataaag tagttaatac agaagtatct gatttactta cagtgtcacc 

2311 gaataattct ctgagcagtt ttgattttat taatcaaatt gaaacaatca gaaatgaaaa aggtaatgca 

2381 tatgtgctaa ttgaacgaga catctatcat caaccatcaa agcttttctt attaaatcca gatgttgttg 

2451 aaatgttaat tgaaaaccaa tcacgtgaac tttattattc cattcatgct gcaactggaa ataaattgat 

2521 tgttcataat atggacatgt tgcattttaa acacatcgtg gcatctaata tggtgcaagg cattagtccg 

2591 attgatgtgt tgaagaatac aactgatttt gataatgcag taagaacctt taatcttaca gaaatgcaaa 

2661 aacctgattc tttcatgctt aaatatggtt ccaatgtagg taaagaaaaa aggcagcaag tgttagaaga 

2731 tttcaaacag tactatgaag aaaacggtgg aatattattc caagagcctg gtgttgaaat cgaaccgtta 

2801 cctaaaaaat atgtctctga agatatagcg gcaagcgaga atttaacaag agaaagagta gctaacgttt 

2871 ttcaattgcc ctcagtattc ttaaatgcaa gatcaaatac aaatttcgcg aaaaatgaag agttaaacag 

2941 attttacttg cagcatacct tattgccaat cgtcaaacag tatgaagaag aatttaatcg gaaactactt 

3011 actaaaacag acagagaaaa aaataggtat tttaaattta acgttaaatc ttatttaagg gctgatagtg 

3081 caacacaagc agaagtgtac tttaaagcag ttcgtagtgg ttactacact ataaatgaca ttagagagtg 

3151 ggaagattta ccaccagttg aaggtggaga taagccgcta ataagcggtg atttataccc aattgacacg 

3221 ccacttgaat taagaaaatc tttgaaaggt ggtgataaaa atgtcaatga aagctaagta ttttcaaatg 

3291 aaaagaaaat caaaaagtaa aggtgaaata tttatttatg gtgatattgt aagtgataaa tggtttgaaa 

3361 gtgatgtaac tgctacagat ttcaaaaata aactagatga actaggagac atcagtgaaa tagatgttca 

3431 tataaattca tctggaggca gtgtatttga agggcatgca atatacaata tgctaaaaat gcatcctgca 

3501 aaaattaata tctatgtcga tgccttagcg gcatcaattg ctagtgttat cgctatgagt ggtgacacta 

3571 tttttatgca caaaaatagt tttttaatga ttcataattc atgggttatg actgtaggta atgcagaaga 

3641 gttaagaaag acagcggatt tacttgaaaa aacagatgct gttagtaatt cagcttattt agataaagca 

3711 aaagatttag atcaagaaca cttaaaacag atgttagatg cagaaacttg gcttactgca gaagaagcct 

3781 tgtctttcgg cttgatagat gaaattttag gagctaatga aataactgct agtatctcta aagagcaata 

3851 taagcgtttc gagaacgtcc cagaagattt aaagaaagat gtagacaaaa tcactaaaat cgatgatgta 

3921 gatacgtttg aattggttga aacacctaaa gaaagtatgt cactagaaga aaaagaaaaa agagaaaaaa 

3991 ttaaacgcga atgcgaaatt ttaaaaatga caatgagtta ttaggaggaa atgaaatgcc gacattatat 

4061 gaattaaaac aatccttagg tatgattgga caacaattaa aaaataaaaa tgatgaattg agtcagaaag 

4131 caacagaccc aaatattgat atggaagaca tcaaacaact agaaacagaa aaagcaggct tacaacaaag 

4201 atttaacatt gttgaaagac aagtaaaaga cattgaagaa aaagaaaaag cgaaagttaa agacacagga 

4271 gaagcttatc aatctttaaa tgatcatgag aagatggtta aagctaaggc agagttttat cgtcacgcga

4341 ttttaccaaa tgaatttgaa aaaccttcaa tggaggcaca acgtttatta cacgctttac caacaggtaa 

4411 tgattcaggt ggtgataagc tcttaccaaa aacactttct aaagaaattg tttcagaacc atttgctaaa 

4481 aaccaattac gtgaaaaagc tcgtctaact aacattaaag gtttagagat tccaagagtt tcatatactt 

4551 tagacgatga tgacttcatt acagatgtag aaacagcaaa agaattaaaa ttaaaaggtg atacagttaa 

4621 attcactact aataaattca aagtatttgc tgcaatttca gatactgtaa ttcatggatc agatgtagat 

4691 ttagtaaact gggttgaaaa cgcactacaa tcaggtctag cagctaaaga acgtaaagat gccttagcag 




4761 taagtcctaa atctggatta gatcacatgt cattttacaa tggatctgtt aaagaagttg agggagcaga

4831 catgtatgat gctattatta acgctttagc agatttacat gaagattacc gtgataacgc aacaatttat

4901 atgcgatatg cggattatgt caaaattatc agtgttcttt caaatggaac aacaaatttc tttgacacac 

4971 cagcagaaaa agtatttggc aaaccagtag tatttacaga tgcagcagtt aaacctattg tgggagattt 

5041 caattatttt ggaattaact atgatggaac aacttatgac actgataaag atgttaaaaa aggcgaatat 

5111 ttgtttgtat taactgcatg gtatgatcag caacgtacat tagacagtgc attcagaatt gcaaaagcaa 

5181 aagaaaatac aggttcatta cccagctaag ccccaaaagg ttaatgtaac agctaaggct aaatcagctg 

5251 taatatcagc cgaatagggg tgatgaaatg agtttagaag aaattaaatt gtggttgaga attgactata 

5321 atttcgaaaa tgatttaatt gaaggtctca ttcaatcggc taagtctgaa ttactattaa gtggggttcc 

5391 agattatgac aaagatgact tggaataccc gcttttttgt acagcgatta gatatatcat tgcaagagat 

5461 tatgaaagtc gtgggtactc aaatgaccaa tctagaagca aggtttttaa tgaaaaggga ttgcaaaaaa 

5531 tgattctgaa attaaaaaag tggtaggtga tttttaaatg gaatttaatg aatttaaaga tcgcgcatat 

5601 ttttttcaat atgtaaataa agggccgtat ccagatgaag aggaaaaaat gaagttgtat agttgctttt 

5671 gtaaaatata taatccttct atgaaagata gagaaatttt aaaagcgact gaatcaaagt caggactaac 

5741 cataattatg aggtcttcta aaattgaata tctaccacaa acaaatcact tagttaaaat tgacagaggc 

5811 ttatattccg ataaattatt caacattaaa gaaataagaa ttgatacacc agatattggc tataatacag 

5881 tggttttatc agaaaaatga gtgtagaaat taaagggata cctgaagtgt tgaagaaatt agaatcggta 

5951 tacggtaaac aatcaatgca agctaagagt gatagagctt taaatgaagc atctgaattt tttataaagg 

6021 ctttaaagaa agaattcgag agttttaaag atacgggtgc tagcatagaa gaaatgacta aatctaagcc 

6091 ttatacaaaa gtaggaagtc aagaaagagc tgttttaatt gaatgggtag gccctatgaa tcgcaaaaac 

6161 attattcact tgaatgaaca tggttataca agagatggaa aaaaatatac accaagaggt tttggagtta 

6231 ttgcaaaaac attagctgct aatgaacgga agtatagaga aattataaaa aaggagttgg ccagataaat

6301 gaatatatta aacaccataa aagaaatttt attatctgat gcagagctcc aaacatatat aaattctaga 

6371 atatactatt ataaagtcac tgaaaatgct gaaacttcca aaccttttgt tgttattaca cctatttatg

6441 atttaccttc agacttcatg tctgataaat atcttagtga agaatactta attcaaatag atgtagaatc 

6511 ttcaaataat cagaaaacaa ttgatataac aaaacgaata agatatctgt tatatcaaca aaatttaatt 

6581 caagcatcta gtcagttaga tgcttatttt gaagaaacta aacgttatgt gatgtcgaga cgttatcaag 

6651 gcataccaaa aaatatatat tataaaaatc agcgcatcga ataggtgtgc tttttaattt ttaaggagga

6721 aataagcaat ggcagaagga caaggttctt ataaagtagg ttttaaaaga ttatacgttg gagtttttaa 

6791 cccagaagca acaaaagtag ttaaacgcat gacatgggaa gatgaaaaag gtggtacagt tgatctaaat

6861 atcacaggtt tagcaccaga tttagtagat atgtttgcat ctaacaaacg tgtttggatg aaaaaacaag 

6931 gtactaatga agttaagtct gacatgagta tttttaatat tccaagtgaa gatctaaata cagttattgg 

7001 tcgttctaaa gataaaaatg gtacatcttg ggtaggagag aatacaagag caccatacgt aacagttatt 

7071 ggagaatctg aagatggttt aacaggtcaa ccagtgtacg ttgcgctact taaaggtact tttagcttgg

7141 attcaattga atttaaaaca cgaggagaaa aagcagaagc accagagcca acaaaattaa ctggtgactg 

7211 gatgaacaga aaagttgatg ttgatggtac tccacaaggt attgtatacg ggtatcatga aggtaaagaa 

7281 ggagaagcag aattcttcaa aaaagtattc gttggataca cggacagtga agatcattca gaggattctg 

7351 caagttcgtt acccagctaa cccccaaaat gttgaagtag cagttaattc aaaatctgca acagtttcag

7421 cagaataggg gctttcaaaa taaatcaaag gagaataatt tatgactaaa actttaaagg tttataaagg 

7491 agacgacgtc gtagcttctg aacaaggtga aggcaaagtg tcagtaactt tatctaattt agaagcggat

7561 acaacttatc caaaaggtac ttaccaagtg gcatgggaag aaaatggtaa agaatctagt aaagttgatg 

7631 tacctcaatt caaaaccaat ccaattctag tctcaggcgt atcatttaca cccgaaacta aatcaatcac

7701 ggtaaatgct gatgacaatg ttgaaccaaa cattgcacca agtacagcaa cgaataaaac gttgaaatat

7771 acaagtgaac atccagagtt tgttactgtt gatgagagaa caggagcaat tcacggtgta gctgagggaa 

7841 cttcagttat cactgctacg tctactgacg gaagtgacaa gtctggacaa attacagtaa cagtaacaaa 

7911 tggataatta tttgagacgc agaatatctg cgtctttttt atttgaataa aaggagctaa tacaatgatt

7981 aaatttgaaa ttaaagaccg taaaacagga aaaacagaga gctatacaaa agaagatgtg acaatgggcg 

8051 aagcagaaaa atgctatgag tatttagaat tagtaaatca agagaataaa aaagaagtac ctaacgcaac

8121 aaaaatgaga caaaaagagc gacagttatt agtagattta tttaaagatg aaggattgac tgaagaagat 

8191 gttttgaaca agatgagcac taaaacttat acaaaagcct tgaaagatat atttcgagaa atcaatggtg

8261 aagatgaaga agattcagaa actgaaccag aagagatggg aaagacagaa gaacaatctc aataaaagat 

8331 attttatcga acattaagaa aatacaacgt ttctgtatgg agcagtatgg gtggacatta actgaagtca 

8401 gaaaacagcc gtatgtaaaa cttttagaaa tacttaatga agagaataaa gaagagactg aagaaaaaca

8471 aagtgaacaa aaagtcatta caggtacgga tttaagaaaa ctttttggaa gctagaaagg aggttaatat

8541 gaatgaaaaa gtagaaggca tgaccttgga gctgaaatta gaccatttag gtgtccaaga aggcatgaag 

8611 ggtttaaagc gacaattagg tgttgttaat agtgaaatga aagctaatct gtcatcattt gataagtctg 

8681 aaaaatcaat ggaaaagtat caggcgagaa ttaaggggtt aaatgataag cttaaagttc aaaaaaagat

8751 gtattctcaa gtagaagatg agcttaaaca agttaacgct aattatcaaa aagctaaatc tagtgtaaaa 

8821 gatgttgaga aagcatattt aaagctagta gaagctaata aaaaagaaaa attagctctt gataaatcta 

8891 aagaagcctt aaaatcttcg aatacagaac ttaaaaaagc tgaaaatcaa tataaacgta caaatcaacg

8961 taaacaagat gcatatcaaa aacttaaaca gttgagagat gcagaacaaa agcttaagaa tagtaaccaa 

9031 gctactactg cacaactaaa aagagcaagt gacgcagtac agaagcagtc cgctaagcat aaagcacttg

9101 ttgaacaata taaacaagaa ggcaatcaag ttcaaaaact aaaagtacaa aatgataatc tttcaaaatc 

9171 aaacgaaaaa atagaaaatt cttacgctaa aactaatact aaattaaagc aaacagaaaa agaatttaat

9241 gatttaaata atactattaa gaatcatagc gctaatgtcg caaaagctga aacagctgtt aacaaagaaa 

9311 aagctgcttt aaataattta gagcgttcaa tagataaagc ttcatccgaa atgaagactt ttaacaaaga 

9381 acaaatgata gctcaaagtc atttcggcaa acttgctagt caagcggatg tcatgtcaaa gaaatttagt

9451 tctattggag ataaaatgac ttccctagga cgtacgatga cgatgggcgt atctacaccg attactttag 

9521 ggttaggtgc agcattaaaa acaagtgcag acttcgaagg gcaaatgtct cgagttggag cgattgcaca

9591 agcaagcagt aaagacttaa aaagcatgtc taatcaagcg gttgacttag gcgctaaaac aagtaaaagt

9661 gctaacgaag ttgctaaagg tatggaagaa ttggcagctt taggctttaa tgccaaacaa acaatggagg

9731 ctatgccggg tgttatcagt gcagcagaag caagcggtgc agaaatggct acaactgcaa ctgtaatggc

9801 atcagcaatt aattctttcg gtttaaaagc atctgatgca aaccatgttg ctgatttact tgcgagatca

9871 gctaatgata gtgctgcaga tattcaatac atgggagatg cattaaaata tgcaggtact ccagcaaaag

9941 cattaggagt ttcaatagag gacacttctg cagcaattga agttttatct aactcagggt tagaggggtc 

10011 tcaagcaggt actgcattaa gagcttcgtt tattaggcta gctaatccaa gtaaaagtac agctaaggaa
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15401 cttgttttag atggtgtgta tgcatatcga gatataaata gagtgggaat tgatacaaat agaggcatta 

15471 taacattagc gccaggtaaa aatgaattta agattaaagg agacatcagt gatattaaaa ctacatttaa 

15541 gtttcctttt atttataggt aggtgattta atggattatc atgatcattt atcagtaatg gattttaatg 

15611 aattgatttg tgaaaattta ctagatgtag attatggttc ttttaaagaa tattatgaac tgaatgaagc 

15681 taggtacatc acttttacag tttatagaac tactcataat agttttgttt tcgatttact aatttgtgaa 

15751 aacttcataa tttatcatgg tgaaaaatac acaattaagc agacagcgcc aaaggttgaa ggtgataaag 

15821 tttttattga agttacggca tatcacataa tgtatgaatt tcaaaatcac tcagtggaat caaataagct 

15891 tgatgacgac agtagcgaaa ctggtaaaac gccagaatac tctttagatg agtacttaag atatggattt 

15961 gcaaatcaaa aaacttcggt caaaatgacc tataaaataa ttggaaattt taagcgaaaa gtaccgattg 

16031 acgaattagg taacaaaaac ggcttagaat actgtaaaga agcggtagac ctatttggct gtataattta 

16101 cccaaatgat acggagatat gtttttattc tcctgaaaca ttttatcaaa gaagcgagaa agtgattcga 

16171 tatcaatata atactgatac tgtatctgca actgtcagta cattggaatt aagaacagct ataaaagttt 

16241 ttggaaaaaa gtatacagct gaggaaaaga aaaattataa tcctattaga acaactgaca ttaaatattc 

16311 aaatggtttt ataaaagaag gtacttatcg taccgcaaca attgggtcta aagctactat taactttgat 

16381 tgcaagtatg gtaatgaaac agttagattt acaataaaaa agggctctca aggtggaata tataagttga 

16451 ttttagacgg caagcaaatt aagcaaattt cttgttttgc taagtcggtt cagtctgaaa caatagattt 

16521 aataaaaaat attgataaag gcaagcacgt tttagaaatg atatttttag gagaagaccc caaaaataga 

16591 attgatatat cttcaaataa aaaagctaag ccttgtatgt atgttggaac tgaaaaatca acagtcttaa 

16661 atttaattgc tgacaactca ggtcgcaatc aatacaaagc aattgttgac tacgtcgcag atagtgcaaa 

16731 gcagtttggg attcgatatg ctaatacgca aacaaatgaa gatatcgaaa cacaggataa gctgttagaa 

16801 tttgcaaaaa agcaaataaa tgatactcct aagactgaat tagatgttaa ttatataggt tatgaaaaaa 

16871 tagagccaag agatagcgta ttctttgttc atgaattaat gggatataac actgaattaa aggttgttaa 

16941 acttgatagg tcacatccat ttgtaaacgc aatagatgaa gtgtctttca gcaatgaaat aaaggatatg 

17011 gtacaaattc aacaagcgct taacagacga gttattgcac aagataatag atataactat caagcaaatc 

17081 gtataaatca tttatacact agtactttga attctccttt cgagacaatg gatataggga gtgtattaat 

17151 ataatggcaa cagaagaagt taaaatcaaa gcgctacttg aaaacgataa acagtacttt ccagctacac 

17221 attggaaagc tataaatggg ataccttatg caggcagtag tgatattgat ggattgcctc aagacggtat 

17291 catttcggta gatgataaaa ataaattaga taatttaaaa ataggcgaag caggaattat tcaaaatagc 

17361 attgtacaga aatccccaaa cggtaaattg tggaaaataa cagttgacga tagtgggaaa cttggtacag 

17431 tgctatttta ttagaaagga aggtgcatta tggaaaattt gtatttaata aaggatttgg gagctttagc 

17501 aggtcgagat tatagagcta aggaaataca aaacttacaa agaatagagc aatttgcgct tggcttgaca 

17571 acagagttta agttgcatca gaaagctaaa acaattcaac acttcgctga gcaaatttat tataatggta 

17641 gatcgcaagc agcagtaaac aaatctttac aaagtcaaat taacgcactt gttgtggcac cacgtaataa 

17711 cagtgctaat gagattgttc aagctcgagt taatgtaaac ggcgaaacct ttgacacatt aaaagaacat 

17781 ttagacgatt gggaaaccca aactcaaatt aataaagagg aaactataag agaattaaat aagaccaaac 

17851 aagaaattct tgatatcgag tatcgttttg aacctgataa gcaagaattt ttatttgtga cagaacttgc 

17921 acctcttaca aatgcagtaa tgcaatcctt ctggtttgat aatagaacag gcatagtata catgacacaa 

17991 gctagaaata atggctatat gctaagtcgt ctaagaccta atggtcaatt tatagacagc tcattgattg 

18061 taggtggggg tcatggtaca cataacggtt atagatatat tgatgatgag ttatggattt atagttttat 

18131 cttaaatggt aataatgaga atacattagt tcgtttcaag tatacgccta atgtggaaat tagctatggc 

18201 aagtatggta tgcaagatgt atttacagga cacccagaaa aaccctacat cacccctgtc ataaatgaaa 

18271 aagaaaataa aattctatac agaattgaga gacctagaag tcactgggaa cttgaaaact caatgaatta 

18341 tatagagata agaagtttag acgatgttga taaaaatatt gataaagttt tgcataaaat cagtatccct 

18411 atgagactaa caaacgaaac ccaaccaatg cagggtgtga cttttgatga aaaatacttg tattggtata 

18481 caggagacag taatccaaat aatagaaact atttaacggc tttcgattta gaaacaggag aagaagcgta 

18551 tcaggttaat gctgactatg gtggaacact agattcattt cctggcgaat ttgcggaagc agaaggtttg 

18621 caaatatact atgacaaaga tagtggtaaa aaagctttga tgctaggtgt tactgtcggt ggtgatggaa 

18691 atagaacaca tcgtattttc atgattgggc aaagaggtat tttagaaata cttcactcaa gaggcgttcc 

18761 ttttatcatg agtgacacag gtggtagagt taaaccttta ccaatgaggc ctgataaact taagaatctt 

18831 gggatgttaa cagagccagg tctttactat ttatacactg atcatacagt tcaaatcgat gatttcccat 

18901 taccaagaga atggcgtgat gcaggttggt tcttggaagt taagccacca caaactggcg gtgatgtaat 

18971 tcagatattg acgcgtaata gttatgcaag gaatatgatg acttttgaaa gggtgctttc tggaagaact 

19041 ggagacattt cggactggaa ttatgtgcct aaaaatagtg gtaaatggga gagagtacct tcattcatca 

19111 caaaaatgtc agatattaac atagtaggca tgtcgtttta tttaactacg gatgatacaa aacgttttac 

19181 agattttcca actgaacgta aaggggtagc tggttggaac ttatatgtag aagcttcaaa cacaggtggc 

19251 tttgttcata ggctagttcg taatagtgtt acagcatctg ctgagatact attgaaaaat tatgatagta 

19321 aaacaagttc agggccatgg actttacacg aagggagaat tataagttaa tgagtaattt agagaaatct 

19391 gtagctataa atttagaaaa cacagcgcat tatgaaaata tttcaaatct agatataact tttagaacag 

19461 gagagagtga ttcttctgtt cttcttttta atatcactaa aaataatcaa ccgttattat tgagtgaaga 

19531 aaatatcaaa gcacgaatag cgattcgagg taaaggagtc atggtagttg ctccactaga aatattagat 

19601 ccatttaaag gtattttaaa atttcaatta cctaatgatg taattaaacg agatggaagt tatcaagctc 

19671 aagtttcggt tgcagaatta ggtaattcag acgtggtagt tgtcgagaga actatcacat ttaacgttga 

19741 aaaaagtttg tttagcatga ttccatctga aacaaaatta cactatattg ttgaatttca ggaattagaa 

19811 aaaactacta tggatcgtgc gaaagcaatg gacgaggcta taaaaaatgg tgaagattat gcgagtctga 

19881 tcgaaaaagc taaagaaaaa ggtctatcag atattcaaat agcaaaatct tcaagtatag atgaattaaa 

19951 gcaacttgct aatagccata tatctgattt ggaaaataaa gcgcaagcat attcaagaac attcgatgag 

20021 caaaagcgat atatggatga gaaacatgaa gccttcaagc agtcagtgaa tagtggtggt ttagtcacaa 

20091 gtggttctac ttcaaattgg caaaaagcta agattactaa agatgatggt aagataatgc agattactgg 

20161 atttgatttt aataatccag aacaaagaat aggtgatcca acccaattta tttatgtttc gcaagctata 

20231 aattatccaa gaggtgttag tactaacggt actgtcgaat atttagtagt aacttcagat tacaagcgta

20301 tgacttatcg accgaacggt acaaataaag tgtttgttaa aagaaaagaa gcgggttcat ggtctgagtg 

20371 gtcagaatta gctattaatg attacaatac accttttgaa actgttcaaa gtgcccaatc aaaagctaat 

20441 atggccgaaa gtaacgctaa attatacgca gatgacaagt ttaataaaag gtattcggtt atttttgatg 

20511 gaacagcaaa tggtgtgggc tctacattgt acttaaatga gagtttagac caatttattt tattaatttt 

20581 ttatgggact tttccaggtg gtgactttac agagtttggc agtccttttg gaggaggaaa gatttcattg 

20651 aatccctcaa atcttccaga tggtgatgga aatggtggag gtgtttatga gtttggatta actaaatcta 
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20721 gtcgtacatc tttaactata tcaaacgatg tctatttcga cttaggaagt caaagaggct ctggtgcgaa 

20791 cgcaaataga cagacaatta acaaaattat aggagtgaga aaataatgca aatattagtt aacaagcgta

20861 atgagataat ttcatacgct atcattggtg gctttgaaga aggtattgat actgaaaatt taccagaaaa 

20931 tttctctcaa gtttttagac ctaaagcctt taaatattca aatggggaaa tagtttttaa cgaagattat 

21001 tcagaagaaa aagatgactt gcatcaacag attgacagtg aagaacaaaa cacagtcgct tctgacgaca 

21071 tcttacgaaa aatggttgct agtatgcaga aacaagttgt tcaaagtaca aagctatcga tgcaagttaa 

21141 taagcaaaat gcactaatgg caaaacaact tgtgacactt aataaaaaat tagaagaggt taaaggagag 

21211 actgaaaatg cttaaattaa tttcaccaac attcgaagat attaaaacat ggtatcaatt gaaagaatat 

21281 agtaaagaag acatagcgtg gtatgtagat atggaagtta tagataaaga ggaatatgca attattacag 

21351 gagaaaagta tccagaaaat ctagagtcat aggttataat cttatggctt tttaatttga ataaagtggg 

21421 tggtgtaatg tttggattta ccaaacgaca cgaacaagat tggcgtttaa cgcgattaga agaaaatgat 

21491 aagactatgt ttgaaaaatt cgacagaata gaagacagtc tgagaacgca agaaaaaatt tatgacaagt 

21561 tagatagaaa tttcgaagaa ctaaggcgtg acaaagaaga agatgaaaaa aataaagaga aaaatgctaa 

21631 aaatattaga gacatcaaga tgtggattct aggattaata gggacgattc taagtacatt tgttatagcc 

21701 ttgttaaaaa ctatttttgg catttaaagg aggtgattac catgcttaag ggaattttag gatatagctt 

21771 ttggtcgtgt ttctggttta gtaagtgtaa gtaatagtta agagtcagtg cttcggcact ggctttttat 

21841 tttggaaaaa aggagcaaac aaatggatgc aaaagtaata acaagataca tcgtattgat cttagcatta 

21911 gtaaatcaat tcttagcgaa caaaggtatt agcccgattc cagtagacga tgagaatata tcatcaataa 

21981 tacttactgt tgttgcttta tatactacgt ataaagacaa tccaacatct caagaaggta aatgggcaaa 

22051 tcaaaagcta aagaaatata aagctgaaaa caagtataga aaagcaacag ggcaagcgcc aattaaagaa 

22121 gtaatgacac ctacgaatat gaacgacaca aatgatttag ggtaggtgtt gaccaatgtt gataacaaaa 

22191 aaccaagcag aaaaatggtt tgataattca ttagggaagc agttcaatcc tgatttgttt tatggatttc 

22261 agtgttacga ttacgcaaat atgtttttta tgatagcaac aggcgaaagg ttacaaggtt tatacgctta 

22331 taatattcca tttgataata aagcaaggat tgaaaaatac gggcaaataa ttaaaaacta tgatagcttt 

22401 ttaccgcaaa agttggacat tgtcgttttc ccgtcaaagt atggtggcgg agctggacat gttgaaattg 

22471 ttgagagcgc taatctaaac actttcacat cgtttggcca aaattggaat ggtaaaggtt ggacaaatgg 

22541 cgttgcgcaa cctggttggg gtcccgaaac cgttacaaga catgttcatt attacgatga cccaatgtat 

22611 tttattagat taaatttccc agataaagta agtgttggag ataaagctaa aagcgttatt aagcaagcaa 

22681 ctgccaaaaa gcaagcagta attaaaccta aaaaaattat gcttgtagcc ggtcatggtt ataacgatcc 

22751 tggagcagta ggaaacggaa caaacgaacg cgattttata cgtaaatata taacgccaaa tatcgctaag 

22821 tatttaagac atgccggtca tgaagtcgca ttatatggtg gctcaagtca atcacaagac atgtatcaag 

22891 atacagcata cggtgttaat gtaggtaata aaaaagatta tggcttatat tgggttaaat cacaggggta 

22961 tgacattgtt ctagaaatac atttagacgc agcaggagaa agcgcaagtg gtgggcatgt tattatctca 

23031 agtcaattca atgcagatac tattgataaa agtatacaag atgttattaa aaataactta ggacaaataa 

23101 gaggtgtaac acctcgtaac gatttactaa atgttaacgt atcagcagaa ataaatataa attatcgctt 

23171 atctgaatta ggttttatca ctaataaaaa tgatatggat tggattaaga aaaactatga cttgtattct 

23241 aaattaatag ccggtgcgat tcatggtaag cctatcggtg gtgtgatatc tagtgaggtt aaaacaccag 

23311 ttaaaaacga aaagaatccg ccagtgccag caggttatac acccgataaa aataatgtac cgtataaaaa 

23381 agaaactggt tattacacag ttgccaatgt taaaggtaat aacgtaaggg acggctattc aactaattca 

23451 agaattactg gtgtattacc taataacgca acaatcaaat atgacggcgc atattgtatc aatggctata 

23521 gatggattac ttatattgct aatagtggac aacgtcgtta tattgctaca ggagaggtag acaaggcagg 

23591 taatagaata agcagttttg gtaagtttag tgcagtttga taattgtata tgatgaatct taggcaggta 

23661 cttcggtact tgcctattat ttaaaattaa taaacagtta atttttacat gaatatatta aattttaaaa 

23731 aaacaaacgt ttttagtata taaattattt tgtgttcgta ttgtgtgcta tgattaaaaa gttgttatgg 

23801 tcaactatat cgtggtttta tgtttattat caatcaaaat ataaattatt tataatttgt ttggtaatga 

23871 acgggttttt ttcgaaataa tagtaaaaaa acacatctgt agatatttta aactcggtaa atcttttaat 

23941 aaatatttaa ttttattaaa agttaaaaag gtttaatata aaaatgtaat aaaatttata aagaaaggaa 

24011 atgattttta tggtcaaaaa aagactatta gctgcaacat tgtcgttagg aataatcact cctattgcta

24081 cttcgtttca tgaatctaaa gctgataaca atattgagaa tattggtgat ggcgctgagg tagtcaaaag 

24151 aacagaagat acaagtagcg ataagtgggg ggtcacacaa aatattcagt ttgattttgt taaagataaa 

24221 aagtataaca aagacgcttt gatttcaaaa atgcaaggtt ttatcaattc aaagactact tattacaatt 

24291 acaaaaacac agatcatata aaagcaatga ggtggccttt ccaatacaat attggtctca aaacaaatga 

24361 ccccaatgta gatttaataa attatctacc taaaaataaa atagattcag taaatgttag tcaaacatta 

24431 ggttataaca taggtggtaa ttttaatagt ggtccatcaa caggaggtaa tggttcattt aattattcaa 

24501 aaacaattag ttataataaa ataaaaagta ggtgataaga tgactcaatt tctaggggcg cttcttctta 

24571 caggagtttt aggttacata ccatataaat atctaacaat gataggttta gttagtgaaa aaaacaaggt 

24641 tatcaatact cctgtattat tgattttttc tattgaaaca tgtttgatat ggttttatag ttttataatt 

24711 tttaataatg ttgatttaaa aaatttgaat ttaattcagt tgcttacagg tctaaaagca aatattttgt 

24781 ttctatttat ttttgtttta acagtgtttg tatttaatcc tttaattgtt aaatttatta tctggttaat 

24851 taatataacc agaaagttta tgaaattgga ttgtataagc ttattagaca aaagagacaa gttgtttaat 

24921 aacaacggta aaccagtatt tatagttata aaagactttg aaaacagaat cattgaagag ggtgaactta 

24991 aaacctataa ttcagctggt agcgatttcg atttactaga agttgagcga caagatttca aagtatctga 

25061 tttaccgtca aacgatgaat tgtatattaa acatacactt gtagacctta aacaacaaat taaattggat 

25131 ttatatttaa tgaatgaata ctaatctttt ttcttagctt tttctgataa agtgcttttt aatttttcgc 

25201 tggcgcccgg cttttcaaaa cttttgttta ttgggttact acgagtagct tcttgttttt tgtttttatc 

25271 cgccataaaa ttctcaccac cattcaacgt ctacacttgt aggcgttttt ttatttagta aagtcataat 

25341 gaatcttctt tggttaactt atctccatct attttttgtg aaataaattc caagtattta cgcgcattat 

25411 gtgacgataa atctttaggt aactcataag tgaatggttg attaccacta gttaaaactt catatactat 

25481 agtttctttt tttattctgc aattagttat tttcattata aacttccttt caaacactgc tgaaatagac 

25551 gtcttttata ttaaagcgcc acacaggcgc tgttaatcac aatacaactt tgcccattac tttaatatta

25621 ctaaacgaag cgactttgat atcatcatac ttcggattta gagataccaa attaatatag tcttcgcata 

25691 tatctacacg cttgataaga cttactccat ctaatacaac gagtgcaatt gtaccatctt taatagaatc 

25761 ttctttctta ataaaagcgt atgttccttg ttttaacata ggttccattg aatcaccatt aactaaaata 

25831 caaaaatcag catttgatgg cgtttcgtct tctttaaaaa atacttcttc atgcaatatg tcatcatata 

25901 attcttctcc tatgccagca ccagttgcac cacatgcaat atacgatact agtttagact ctttatatcc 

25971 atctatagaa gtgactttat tctgttcttc caattgttca tttgcatagt taagtacgtt ttcttggcgg 
26041 ggaggtgtga gtttgttgta tatggaagtg atgccgctac cgtctttgta tgtagtattt gattcactat

26111 acaaatcatt aatcttcaca ttgaagtact cagccaaaat tttggcagtt gataatcgag gttcttcctt 

26181 ttcattttcc cattttgata tcttgccttt cgttaatttc attaagtcgg gatatttatt attaagatca 

26251 gttgctaatt gttccatagt catattttta tttttttctt agcttcttta aaccttcacc aatacccata 

26321 cgaaaccctc cttatataag ataatttcat tataaaagtt tcgaaaacga aacgcaagga aaatattatt 

26391 gcaaaagttg ttgacatcga aacttttatg atgtattctt aaatcaagtt gttacaaacg aaacaaaagg 

26461 agggggttca atgacaacta gtgtagcaga taaaccatac ttaaaaataa aaagcttgat tgcacttaaa 

26531 ggaactaacc aaaaagaagt tgctaaagca atcggaatga gtagaagttt attgagtata aagataaatc 

26601 gaattaatgg cagagacttt acaacttcag aagccaaaaa attagcagat catttaaatg ttaaagttga 

26671 tgattttttt taaactttaa gtttcgaaag tgacaactaa ataaaaataa ggaggacact atggaacaaa 

26741 taacgttaac caaagaagag ttgaaagaaa ttatagcgaa agaagttaga aatgctataa aaggcgagaa 

26811 accaatcagc tcaggtgcaa ttttcagtaa agtaagaatc aataatgacg atttagaaga aatcaataaa 

26881 aaactcaatt tcgcaaaaga tttgtcgcta ggaagattga ggaagctcaa tcatccgatt ccgctaaaaa

26951 agtatcagca tggcttcgaa tcaattcatc aaaaagctta tgtacaagat gttcatgacc atattagaaa 

27021 attaacatta tcaatttttg gagtgacact taattcagac ttgagtgaaa gtgaatacaa cctagcagca 

27091 aaaatttata gagatatcaa aaactattat ttatatatct atgaaaagag agtttcagaa ttaactatcg 

27161 atgatttcga atgaaggagg aactacaaat gaaactacta agaaggctat tcaataaaaa acacgaaaac 

27231 ttaattgacg tgtggcatgg aaatcaatgg ttaaaagtga aagaaagcaa attaaaaaaa tataaagtgg 

27301 tctcggatag agaaggtaag aaatatctaa ttaaataagc gcacttaatt agtgcaagta atcaagtgcg 

27371 ctattgcctt acaatcctaa atcttttctg cttttttctt cttcttgtaa tcccaataac acagaagagt 

27441 aaatgctgaa atagtcacga gcaacgctat ctttagcgaa tgcaattacg tcatcaccga cttcttgcca 

27511 ttcgttatga atcttatgtc tatctagagc tctaggtaat agcgagattg taatatcgtg agcaattttc 

27581 tctaaatcca taaatttcac ctccttccac tgggagataa ctaaattata taacaaaaca acttaaagga 

27651 ggaacgacaa atgcaagctc aaaacaaaaa agtcatctat tactactatg acgaagaagg taataggcga 

27721 ccattagata ttcaaattaa tgacggatat gaactgatgg tccgatctca tttcatcaac aacaccattg 

27791 aagaaatacc atacgtaaat aataacttat atgccttggt tgatggttat gaatttaagt tagattgaat 

27861 ttttgagaaa gatattgaaa agctaatttc cccataagat taagagacat actggatgtt ttgttaacga 

27931 ctcttttaac ttcgttccaa gttttattgt ctctaatatt atcgagaaat tcatggccag accaagtgat 

28001 gtcatcaata atccaagaaa cgaccctgcc ttcgatgaat ttcagatcgc aacaaataaa tttagcttct 

28071 tctaatttta aaagtgagta cattactgtt tcaaaatcat atttatcaaa aataatatta tcgttgaaat 

28141 tatgtcgagt aagtggttca cctattttct tattagattc tatttctaag agcaagagtc taacgcaatc 

28211 gtgattaagt ttcatcctat cacctccata acaggagtat agcagaaagg atcataaaca tcttaaaagg 

28281 aggaataaca aatgaacatt caagaagcaa ctaagatagc tacaaaaaat cttgtctcta tgacacggaa 

28351 agattggaaa gaaagtcatc gaactaagat attaccaaca aatgatagtt ttttacaatg catcatttca 

28421 aatagcgatg ggacaaacct tatcagatat tggcaacctt cagccgatga cctcatggca aatgattggg 

28491 aagttataaa cccaactaga gaccaggaat tattgaagca attttagaaa tgctatcaat gatacttttt 

28561 aaattgtttt taaactcatt ttcaaagtaa acaacagtct tgtctgaaat tgttacatga taaatagtgt 

28631 tactagcata cacgccgttt aggaacccag agtttttaag tttatttaaa tcgtatttta catcttcgaa 

28701 atgtagtttt tgaaaatact ttgtatgtat atctttagca cttccaaaat tattgcaggt taatttaacc 

28771 gaacctaact ttacacattc taaataatct ttgtagagta cggacaagat atattgttgg tctttagtaa 

28841 gtgtatcaaa ttcatcagat atcaagggca tgttatcacc tccttaggtt gataacaaca ttatacacga 

28911 aaggagcata aacaaatgaa cacaagatca gaaggattgc gtataggcgt cccacaagtt tctagcaaag 

28981 ctgatgcttc ttcatcctat ttaacggaaa aggaacgtaa cttaggagcg gaaatattag agcttattaa 

29051 aaaaagtgat tacagctact tagaaataaa caaagttttc tatgcattag atagagaact tcaatacagg 

29121 gcgaataata acaaacttta acatttatct aaaggagtga tagagatgcc aaaaatcata ataccaccaa 

29191 caccagaaaa cacatatcga ggcgaagaaa aatttgtgaa aaagttatac gcaacaccta cacaaatcca 

29261 tcaattgttt ggagtatgta gaagtacagt atacaactgg ttgaaatatt accgtgaaga taatttaggt 

29331 gtagaaaatt tatacattga ttattcagca acgggaacat tgattaatat ttctaaatta gaagagtatt 

29401 tgatcagaaa gcataaaaaa tggtattagg aggattatca aatgagcgac acatataaaa gctacctatt 

29471 agcagtgttg tgcttcacgg tcttagcgat tgtactcatg ccgtttctat acttcactac agcatggtca 

29541 attgcgggat tcgcaagtat cgcaacattc atattttata aggaatactt ttatgaagaa taaagaaact 

29611 gctacttgtt ggagcaagta acagtgcaag atgagcaatt gtcttaaata attatataag gagttattaa 

29681 tatgacctta caacaaaaaa tactatcaca ttttgcaaca tatgacaatt tcaattctga tgatgttgtt 

29751 gaagtttttg ggatatctaa aacacatgca aaatccacac tttcaagact taagaaaaaa ggaaagattg 

29821 aattggaaag ttggggtatc tggcgtgttg ttgaaccgca gttacattta actgttgtag aacgtaagaa 

29891 agagatatta gaagaacaat tcgagttatt ggcaagatta aacgaacaaa gtgatgaccc tagagaaata 

29961 gaagaacgca tcaagttaat gattcgttta gccaaccaat tttaaggagg agttaatcaa tggcaatatt 

30031 agaaggtatt tttgaagaat taaaactatt aaataagaat ttacgtgtgc taaatactga actatcaact 

30101 gtagattcat caattgtaca agagaaagtt aaagaagcac caatgccaaa agatgaaaca gctcaactgg 

30171 aatcagttga agaagttaag gaaacttctg ctgatttaac taaagattat gttttatcag taggaaaaga 

30241 gttccttaaa aaagcagata cttctgataa gaaagaattt agaaataaac ttaacgaact tggtgcggat 

30311 aagctatcta ctatcaaaga agagcattat gaaaaaattg ttgattttat gaatgcgaga ataaatgcat 

30381 gaagctagat cactcaaata gagctcatgc aaagcttagt gcaagtggag caaaacaatg gctaaactgt 

30451 ccaccgagta ttaaggcaag tgaaggtatc gcagataaaa gttcagtttt tgctgaagaa ggtacattcg 

30521 ctcatgagtt aagtgagtta tatttcagtc ttaaatatga aggcctaaca cagtttgagt ttaataaagc 

30591 ttttcaaaat tataagcgaa atcaatatta cagtgaagag ttgcgcgaat atgttgaaga gtacgtagct 

30661 aatgtagaag aaaaatataa cgaagctttg agtagagatg acgatgtaat agctttattt gaaacaaaat 

30731 tggatttagg taaatacgtc cctgaatctt ttggtactgg tgatgtcatt atattttcag gtggtgtact 

30801 tgaaattatt gaccttaaat acggtaaagg cattgaagtt tcagctatag ataatcctca acttagatta 

30871 tatggcttgg gcgcatatga actgcttagt ttaatgtatg acattcatac agttcgcatg actatcatac

30941 aaccacgaat agataacttt tctactgaag agttaccaat atcaagatta cttcaatggg gaaccgattt 

31011 tgttaaacca ttagccagac ttgcttataa cggtgaaggt gagtttaaag caggtagtca ttgtagattc 

31081 tgtaagataa agcattcatg tagaacacgt gcagaataca tgcaaaatgt gcctcaaaag ccaccacatt 

31151 tgttgagtga tgaagagatt gcagaacttt tatataaact gcctgacatc aaaaaatggg ctgatgaagt 

31221 agaaaaatat gcactagatc aagcgaaaga aaatgataaa aactattctg gttggaagct tgtagaaggt 

31291 cgctcgcgaa gaatgataac tgatacaaat gcaacgcttg aaaagttagt tgaagcaggt tataaacctg 
31361 aaaatattac agaaaccaag ttacttagca ttacgaattt agaaaaatta atcggcaaaa aagcatcttc
31431 aagatactac a 9 aaa =« | caaa aaaqcc acaaggtaaa ctaacacttg ctaccgagtc tgataaacga
31501 taaaattgca 9»99c"" l^^llt tttglcaaac tataaaaatc aaaaaggacg gtatataaac
31571 ccagctataa a 9« a tctgc £9aa9 a tgat ' 9 gaaaagtaag agcatcatat gcacatattt

31641 acgaaagcaa aa 9«" aaa Lgcaaagea Itcaatcagt ttaatcattc ctaaatcaga

S aS^a SSSS tatagaagcc gctaaagaag aaggaaaagt tagtaagttt 
^aQgcaaag ttcctgcaaa tctgaaactt ccattacgtg atggagatac tgaaagagaa gatgatgtga 
lull a?ta?caaga cgcttltttc attlacgcat caagcaaaca agcacctggt attattgacc aaaacaaaat 
3u£ tagattaalg glttctggaa ctattgtaag tggtgactae attagagctc caatcaattt atttccattc 
Hill SSSS gtaataaggg tatcgcagc. ggattgaaca acattcaact = 

SS S£ K&S SS3£ £SS£ af tgagglgt LagaSttg aaa ttt atga 

™ aMtaoatal waaacatac agcagtaacg atatttcgaa atgtggtgtc tataaataca cagaagctga 

lllll aaatt!™aa atc^aatta tagcltattc aatagatggt ggaccgacta gtgcgattga catgactaaa 

32271 agatttcgaa a "" aa " _JL a ateat gagacgttta aaattgctct atttgaccct gctgtaaaaa 

324U SSS caSctaa? tlcglaagaa ^ttgtLtgc taaacatttt aataaacaga tgccacctga 

«!« =™aSt totacaatgg ttaattcaat gcgtattggc ttacctgctt cgcttgataa agttggagaa 

32S5l" gttttaagac tlcaaaacla aaaagataaa gcaggtaaaa atttaattcg ttatttctct ataccttgta 

1111, f™ aattaatgga ggaagaacaa gaaatttgcc tgaacatgat cttgaaaaat ggcaacaatt 

lllll tatagat^ tltattcglg algtagaagt Igaaatgaca attgctaata aaattaaaga ctttccagta 

32761 actg?aa«g aacaagcat! ttiggttttc gaccaacata taaacgacag aggtattaag ctttctaaat 

I cattga^? aggagctaat gtgltcgata agcagagtaa agaagaattg cttaaacaag ctaaacatat 

lllll n»«^tfta afaaatccta atagtcctac acagttattg gcttggttaa aggatgaaca aggattagat 

lllll ^acSaatt taSaaagaa aacggttcag gatlactcaa aagtagcaac aggaaaagct aaaaaaatgc 

33041 tagaaa^ag attgcS tctalaacca gtgtgaaaaa atacaacaaa atgcatgaca tgatgtgcag 

33?" tgl^gta^g gtalgaggtc tgtttcaatt ctacggtgcc ggtactggaa gatgggcagg 'agaggtgta 

lllll clacttcala atttaa«aa gcattatatt tcagatactg aactagaaat agcaagagat <*tattaaag 

33251" a"aac£ft tgacgattta gatetattac tcaatgttca tcctcaagac ttattaagtc aattagttag 

lllll ga^aca^tt altgctgaag Lggtaatga actagcagta agtgattttt ctgcaataga ggcaagagtc 

lllll atagcatg£ atgcaaaaga aclatggcgt ttagatgtgt tcaacacaca cggaaagata tatgaagcat 

lllll ^rrtcSa aa?atttaat gtaccggtag aaagcataac taaaggcgac cctctcagac aaaaaggaaa 

3 5" aUSccgaa ttagctttag gcta^aagg tggcgctgga gctttaaaag caatgggtgc -"ggaaatg 

lllll ggcattglag aaaacgagtt acaaggttta gttgatagtt ggcgtaacgc aaatcctaac atagttaatt 

lleil Htggaaggc ttgcclagag gctglaatta atactgtaaa atcccgaaag acgcatcata cacatggact 

IIVA taaa?t??lt atgaaaaaal gtettctaat gattgaactg cctagtggaa gagctttagc ttatccaaaa 

lllll gc?ttagttg gtgaaaatag ?tggggtagt caagttgttg aatttatggg 9«:tagatctt ^aaat 

33881 ggtcaaagt? aaaaacgtat ggtgggaagt tagtcgagaa tattgttcaa 9caactgcaa 9ggatttact 

i S5U HI 2SK SSS = 3=5 SSS 

™ KSE SS SS= S^I ^1 =ss 2 

™ ^= s:^ = ™ 9 ^gii i™ 

34301 aacacaacgt "ttta*t , a fl a< , a M- 0 aaaaaaattt caqaaaaata tcacatatct ccagaacttc 

HZl fSat C ag S« aJgtgtgga agaaaagact caaaatctaa 

3«U a^awaagtt gaatatatgc agagtcaaat aaaagatgaa gaaaaagaga gagaaaaaat cagaaaaaaa 

34S11 agatgaagtc 9 aa ""'9 » * agagcggaat atgaagaaga aagaaagaga agattgagac 

£.£ Scttfa tgatgg^acg ^acaaalac a«ca1|tga tcfgtactgg ttcgatgCca cttataacca 

3472^ attl?tcaag alatg|agtg aagcataatg agcgtaatca gtaacagaaa agtagacatg aacgaagcgc 

IVlll aaglcaatg? taaglLcca gcicactaca catacggcga cattgaaatt atagatttta tcgaacaggt 

3486^ "cggcacag tatccacctc Lctagcact cgcaataggt aatgcaataa aatacttgtc "gagcacct 

lllll ttaaaqaatg gtcatgagga cttagcaaag gcgaagtttc acgtccaaag agcttttgac ttgtgggagt 

Will gatgtcStg Icagafaglg catg^aaaga atacttaaac caattttteg 9atctaagag atatetgtat 

tXrni caaaataacq aacgagtggc acatatccat gtagtgaatg gcactcatta ctttcacggg catatogtac 

Kill SHctgtcl agglglglL aagacatttg atacagcgga agagctcgaa acatatataa a 9caacatgg 
«f^aa?ac galgLcaga agcaactaac tttattttaa ggagatagaa atgatgaaaa tcaaagttga 

3"^ aaalataatg laalwga^g alttaattaa gtgggcgcga gaaaatccgg agctatcatc tggcagaaaa 

3535^ Wttata^al cagacaaaal tgatgaaaac tttatttact tcggtgtttt taaaaattgt tttaaaataa 

llAl a^attttat attagttaac gltactttta gtgtcaaagt tgaagaagaa gtaaccgaag aaactaagtt 

»49l tSta^etg tttgaag^ aegagattea agaaggagtc tataaatctg catcatatga gaatgctagt 

35491 tgataggccg c "9aagtgu a » tttcttgcta aagcattcta catcttaaac gaegacctaa 

3SS61 ataaacgaac gtttaaaaaa tgacagaatt "cctcgc » gttcaaaaga atattacgaa 

35631 ctatgacgct aa "tggaaa 9aaggagagt tgattaaata atggaa ^ g ^ atcggagata 

I 7 771 S aa 9agag?gc SSSSt tggaglagaa agcalgtgca t ggga t aggt attgcaagag 
lllll ™«gaaaaa gaftLItL acgaatttgg caaagaeggt gaaagagtta aatttggaat SSaattaaae 
3M11 aataaaattt ttatggagga agacgcaaat gaataaccgc gaacaaatcg aacaatcagt tactagtgct 

agcg^gtata aSSSga cLagaggga ttattaaaag agattgagga cgtgtataag aaa 9C9caag 
Iclli ^Pflataa aatacttaag ggttcaccta atgctatgca agatgcaatc aaagaagata etggtcttga 

^ao!ao?a oaaattawl Hggtcaagt tgtctataaa tatgaggagg agcaggaaaa tgactaacat 
lllll attacaaotg aaactattat Saaagaclc tlgaatgcca gaacgaaatc acaagacgga tgcaggttat 
lllll gacatatft? cagctaaaac tgtcgfacft gagccacaag aaaaggcagt gatcaaaaca 9atgtagctg 
lllll waqcattcc aglgggctat gtcggtttat taactagccg tagtggtgta agtagtaaaa cgcatttagt 
lllll gf^gatag Lgg- J-JJ- ^ a 999 a « a -tcaaga. cgacaa.gaa 

36471 acgttagaga 9tgaggatat 9agtaact« 9gecgg a 9tc a 99 aat ^ aactagctca 
3 366li SS^ £££ 9gacf=ccga ac^aagcaa gtggaggaa, t cgagag tgt tt cagaa=g t 
36681 ggagcaaaag gcttcggaag tagcggagtg caaagacata ttagatcgag tcaaggaggt tttggggaag

36751 tgagtgacac gttagaaata tttttcatag ggtttggtgt ttatctattt tgtcgcatag gtattatttt 

36821 tctcaagagt aaaaagacta tacacacaaa cctatatgaa atgctgttga ttgctactat ctttgtgaca 

36891 tctacatttg ctgataaaca tcaaaagacg catatcttaa tagcattttt agtaatgttt tttatgagta 

36961 agctcaaaca agttcaaggg agctatgagg aatgacacaa tacctagtca caacatttaa agattcaaca 

37031 ggacgtaagc atacacacat aactaaagct aagagcaatc aaaggtttac agttgttgat gcggagagta 

37101 aagaagaagc gaaagagaag tacgaggcac aagctaaaag aaatgcagtt attaaattag ggcagttgtt 

37171 tgaaaatata agggagtgtg ggaaacgact aaacaaatac taagattatt attcttacta gcgatgtatg 

37241 agctaggcaa gtatgtaact gagcaagtat atattatgat gacggctaat gatgatgcag aggcgccgag 

37311 tgactttgaa aaaatcagag ctgaagtttc atggtaatag ctattatcat ttttgaatta attatattaa 

37381 tgtgtttagc aatagcactg gaggtgttgt aaatatgtgg attgtcattt caattgtttt atctatattt 

37451 ttattgatct tgttaagtag catttctcat aagatgaaaa ccatagaagc attggagtat atgaatgctt 

37521 atcttttcaa gcagttagta aaaaataatg gtgttgaagg tatagaagat tatgaaaatg aagttgaacg 

37591 aattagaaaa agatttaaaa gctaaagaga ggcgttggct tctctgttct atttaaaata atgaaaggag 

37661 ccgaacatgt tagacaaagt cactcaaata gaaacaatta aatatgatcg tgatgtttca tattcttatg 

37731 ctgctagtcg tttatctaca cattggacta atcacaatat ggcttggtct gactttatgc agaagctagc 

37801 acaaacagtt agaactaaag aagatttaac tgagtacaat aaaatgtcta agtctgaaca agccgatata 

37871 aaagatgttg gcggatttgt cggtggttat ttaaaagaag gcaaacgacg tgctggtcaa gtcacgaatc 

37941 gttcaatgtt aacacttgat atcgattatg ctgctcaaga tatgactgac atattatcta tgttttatga 

38011 ttttgcatat tgtttatatt caacacataa gcatagagag ataagtccaa gactgcgttt agtgattcct 

38081 ttaaaacgaa atgtaaatgc agatgagtat gaagctattg ggcgtaaagt cgcagatatc gttggcatgg 

38151 attacttcga tgatacaact tatcaaccac ataggttaat gtattggcct tcaactagta acgatgcgga 

38221 atttttcttt acctatgaag atttaccttt gttagaccca gataaaatat taaatgaata tgttgattgg 

38291 actgacacat tagaatggcc aacgtcttca agggaagaga gtaagactaa aagattagca gataagcaag 

38361 gcgacccaga agaaaagccg ggaattgttg gtgcattttg tagagcctat acgatagaag aagctataga 

38431 aacttttatt cctgatttat acgaaaaaca ttctactaac cgttatacct atcatgaagg ttcaactgca 

38501 ggtggattgg tgttatacga aaataacaag tttgcctatt ctcatcataa tacggatccc gtaagcggta 

38571 tgcttgtgaa cagttttgat ttagtacgca tacacttata tggtgctcaa gatgaagacg ctaaaacaga 

38641 tactccggtt aatcgactac ctagttataa agcaatgcag caaagagcgc aaaatgatga agttgttaaa 

38711 aagcaattaa ttaacgacaa aatgtctgat gcaatgcagg atttcgatga aatagtaaat agcgatgatg 

38781 catggtctga gacgttagaa attacttcga aaggtacttt caaagctagt atcccaaata tagaaattat 

38851 attgcgtaat gatccaaatt taaaaggaaa aatagcattt aatgaattta caaaacaaat tgaatgctta 

38921 gggaaaatgc catggaataa taattttaaa atacgtcaat ggcaagacgg tgatgatagc agtttaagaa 

38991 gttatatcga aaagatttat gacatacacc attcaggcaa aacaaaagat gccattataa gcgtagcaat 

39061 gcaaaatgcc tatcatccag taagagatta tctaaataaa atatcgtggg atggacataa acgtcttgaa 

39131 aagttattta tcaaatactt aggtgttgaa gacactgaag tgaatagaac aactaccaaa aaggcattga 

39201 ctgctggaat cgctcgagta atggagccag gatgtaaatt tgactatatg cttacacttt atggtcctca 

39271 aggtgtaggt aaatctgctt tgctaaaaaa aataggtggt gcatggtttt ctgacagttt agtttctgtt 

39341 actggtaagg aagcatatga ggcattacaa ggcgtttggt taatggaaat ggcagaactt gcagctacaa 

39411 gaaaagctga agttgaagct attaagcatt tcatatctaa acaagttgac cggtttcgtg ttgcttatgg 

39481 acattatatt gaagattttc caaggcaatg tattttcatt ggtacaacta ataaagttga tttcttaaga 

39551 gatgaaactg gtggaagacg tttttggcca atgactgtaa atccagagag agttgaagtg aactggtcta 

39621 aactaaccaa agaagagatc gaccaaatct gggcagaagc taaatactat tatgaacaag gagaagagtt 

39691 gttccttaac cctgaactag aagaagaaat gcgttcaatc caaagtaaac atactgagga atctccatat 

39761 acaggtatta ttgatgaata tcttaacacg ccaatcccaa gcaattggga agacttaact atctttgaaa 

39831 gaagacgatt ttatcaaggt gatgttgata tgttaccaac aggaaatgta gattacattg aaagagacaa 

39901 ggtctgtgcg cttgaagtgt ttgttgaatg ttttggtaaa gataagggag atagtagagg atctatggaa 

39971 attagaaaga tttctaacgt cttaagacaa ttagacaatt ggtctgtata tgaaggcaat aaaagtggga 

40041 aaattcgatt tggaaaagat tatggtgtac agatagcgta tgtaagagat gaaagtttag aggatttaat 

40111 ataagaaata ttgaataaat atacattttt agatgttgta tcaaatgttg catcattttt tgagcgatgc 

40181 aacacggtgg tgtaaaaagt aatcgtaggt gttgtatcat ttttggtgat gcaacattga tgcaacaaat 

40251 gatacaacac ctctttccct tctcgctgta aggttcaacc ctgtttgttt ccaatgttgc atcaaattca 

40321 ctataaagtt taaaaagtag tgttagggag taaaggggta taggggtaac cctctaacag ctatttttaa 

40391 aagtttggca agaattgatg caacatcgga acacaaatat aaattttgta tacaaggtga ataaatgaaa 

40461 gaatcgacat tagaaaaata tttagtgaaa gagataacaa agttaaatgg attatgttta aaatgggtcg 

40531 cacctggaac aagaggtgta ccagatagaa ttattattat gccagaagga aaaacatatt ttgtagaaat 

40601 gaagcaagaa aagggaaagt tacatccttt acaaaaatat gtgcatcggc aatttgaaaa cagagatcat 

40671 acagtgtatg tgttatggaa taaagaacaa gtaaatactt ttataagaat ggtaggtgga acatttggcg 

40741 attgatttca aaccacatag ctatcaaaag tatgcaatag ataaagtgat tgataatgag aaatacggtt 

40811 tgtttttaga tatggggcta gggaaaacag tatcaacact tacagcattt agtgaattgc agttgttaga 

40881 cactaaaaaa atgttagtca tagcacctaa acaagttgct aaagatacat gggttgatga agttgataag 

40951 tggaaccatt taaatcatct gaaagtgtct ttagtcttag gaacacctaa agaaagaaat gatgcattaa 

41021 acacagaggc tgatatctat gtaaccaata aagaaaatac taaatggtta tgtgatcaat ataaaaaaga 

41091 atggccattt gacatggttg taattgatga actgtctaca tttaaaagtc ctaagagtca aaggtttaaa 

41161 tctattaaaa agaaattacc actcattaat agatttatag gattaacagg aacacctagt ccaaatagtt 

41231 tacaggattt atgggctcaa gtttatttga tagacagagg cgaaagactt gagtcttcat tcagtcgtta 

41301 tcgagaaagg tactttaaac caacacatca agttagcgaa catgttttta actgggagct aagagacgga 

41371 tctgaagaaa agatatatga acgaatagaa gatatatgtt taagcatgaa agcgaaagat tatctggata 

41441 tgcctgacag agttgatact aaacaaacag tagtcttatc tgaaaaagaa agaaaagtat atgaagaatt 

41511 agaaaaaaac tatattttag aatcggaaga agaaggaaca gttgtagctc agaatggggc atcattaagt

41581 caaaaactac ttcaactatc taacggtgca gtttatacag atgatgaaga tgtaagactt atacatgata 

41651 agaagttaga taagttagag gaaattatag aggagtctca aggccaacca atattattgt tttataactt 

41721 caaacatgat aaagaaagaa tacttcaaag gtttaaggaa gcaaccacat tagaggattc aaactataaa 

41791 gaacgttgga atagtggaga cattaagctg cttatagcac atccagcaag tgcagggcat ggattaaact 

41861 tacaacaagg tgggcacatt attgtttggt ttggacttac atggtcattg gaattatacc aacaagcaaa 

41931 tgcaagatta tatagacaag gacaaaatca tacgactatt attcatcaca tcatgaccga taacacaata 
42001 gatcaaagag tatataaagc tttacaaaat aaagaactaa cgcaagaaga attgatgaaa gctattaaag

42071 caagaatagc taagcataag taatggaggt ataagacggg aaaggcgtca tatgatatta agccaggaac

42141 atttaaatat attgaatcag aaatatataa tttaaatgag aacaagaaag agataaatag attgagaatg

42211 gagatactta acccaacgaa agaactagac accaacattg tgtatggacc gttacaaaaa ggagagccag

42281 ttagaacaac tgagttaatg gcgacaaggt tattgactaa taagatgtta cgtaacttag aagagatggt

42351 tgaagcagtt gaaagtgagt acttaaagtt acctgaagat cataagaaag taataaggtt aaagtattgg

42421 aataaagaca agaagctaaa gatagaacaa ataggggatg cttgtcacat gcatcgcaat acagttacta

42491 caatacgaaa gaactttgtt aaagcgatag cgtatcatgc aggtatcaaa taacattgtg caaagattgt

42561 gcaaaaggcc tacaaatctg tagtaatatg atagtatcgg aaagatgtat aaagttatct gaaagttata

42631 cgacataaat acatgaggca catcgctaag cggtgtgtct tttgttatgc aatcaaagag gtgtaagaga

42701 tgaccaagca taataacatt tataagcatg gtcgtaagtc atatcaatac gattggttct atcattcaaa

42771 agcatggaag aagttaagag agatagcact agatagagat aattatcttt gtcaaatgtg tttacgcgaa

42841 gatattataa cagatgcaaa gattgtgcat cacattattt atgttgatga agatttcaac aaagctttag

42911 acttagataa tctaatgtca gtttgttata gctgtcataa caaaattcat gcaaatgata atgacaaaag

42981 taatcttaag aaaattagag ttctaaaaat ttaaataaaa aaattattta aataaaattt tatgcccccc

43051 tgcccatcgg cttaaaatgt tttttcgccg ggtaccggag aggcc
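
The ORF list of Table 8 can be cross-checked against the sequence above with a few lines of code. The following is a minimal sketch and not part of the original disclosure. It assumes, as the listed rows suggest, that POS gives inclusive 1-based coordinates which include the stop codon, that the a.a. column excludes the stop codon, and that the 25-nucleotide RBS entry for a negative-frame ORF is the reverse complement of the forward-strand bases immediately following the POS range. The function names are hypothetical.

```python
# Illustrative helpers only; the names are hypothetical and the coordinate
# conventions are assumptions inferred from the rows of Table 8.

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orf_length_aa(start: int, end: int) -> int:
    """Amino-acid count for an ORF given inclusive 1-based coordinates.

    Assumes the range includes the stop codon, which is not counted.
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError(f"{start}..{end} spans {n_nt} nt, not a multiple of 3")
    return n_nt // 3 - 1


def forward_frame(start: int) -> int:
    """Reading frame (1, 2 or 3) of a forward-strand ORF starting at `start`."""
    return (start - 1) % 3 + 1


if __name__ == "__main__":
    # 3AORF008: POS 22176..23630, listed as 484 a.a. in frame 3.
    print(orf_length_aa(22176, 23630))  # 484
    print(forward_frame(22176))         # 3
    # 3AORF016 (frame -2): forward-strand bases 26274..26298 of the listing.
    print(reverse_complement("atttttatttttttcttagcttctt"))
```

A POS range whose length is not a multiple of three raises an error, which makes this a quick way to flag transcription errors in either column.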
Table 8

Bacteriophage 3A ORFs list

SID     LAN       FRA  POS           a.a.  RBS sequence               Start  Stop
100379  3AORF001    1  8515..13486   1657  acaggtacggatctaagaaaacttt  ttg    taa
100380  3AORF002    2  37667..40114   815  tttaaaataatgaaaggagccgaac  atg    taa
100381  3AORF003    1  32188..34149   653  ttaaagaaaccgaggxgccaagaac  ttg    tag
100382  3AORF004    3  17457..19370   637  gctattttattagaaaggaaggtgc  att    taa
100383  3AORF005    1  334..2034      566  agaaaaaagauagctcaayaagoay  gtg    taa
100384  3AORF006    1  15571..17154   527  cctttacttacaggcaggcgatcca  atg    taa
100385  3AORF007    2  19337..20836   499  atgatagtaaaacaagttcagggcc  atg    taa
100386  3AORF008    3  22176..23630   484  aatgatttagggtaggtgttgacca  atg    tga
100387  3AORF009    1  40726..42093   455  gtaaatacttttataagaatggtag  gtg    taa
100388  3AORF010    3  13491..14738   415  gaggcggactaacgctacagtaaaa  att    taa
100389  3AORF011    2  2039..3277     412  attaaagacataatgcgttaaggag  gtg    taa
100390  3AORF012    2  4001..5209     402  aaaaaagagaaaaaattaaacgcga  atg    taa
100391  3AORF013    1  30379..31545   388  attttatgaatgcgagaataaatgc  atg    taa
100392  3AORF014    2  14738..15562   274  attatatgggaggtttgactaatta  atg    tag
100393  3AORF015    3  3249..4034     261  cttgaattaagaaaatctttgaaag  gtg    tag
100394  3AORF016   -2  25587..26273   228  aagaagctaagaaaaaaataaaaat  atg    tga
100395  3AORF017    3  6729..7370     213  ttaatttttaaggaggaaataagca  atg    taa
100396  3AORF018    3  24540..25154   204  aataaaataaaaagtaggtgataag  atg    taa
100397  3AORF019    2  31565..32128   187  ctataaaaattaaaaaggacggtat  ata    taa
100398  3AORF020    3  36150..36713   187  gcagtaggaattatgacgggtcaag  ttg    taa
100399  3AORF021    2  24011..24535   174  gtaataaaatttataaagaaaggaa  atg    tga
100400  3AORF022   -2  12423..12938   171  taaagtaccagtagacaatgtaggt  att    tga
100401  3AORF023    1  7462..7917     151  aaaataaatcaaaggagaataattt  atg    taa
100402  3AORF024    1  26731..27174   147  actaaataaaaataaggaggacact  atg    tga
100403  3AORF025    1  42106..42543   145  taagcataagtaatggaggtataag  atg    taa
100404  3AORF026    2  35255..35671   138  aagcaactaactttattttaaggag  ata    taa
100405  3AORF027    2  5888..6298     136  atattggctataatacagtggtttt  atc    taa
100406  3AORF028   -3  27845..28255   136  ccttttaagatgtttatgatccttt  ctg    taa
100407  3AORF029    3  34344..34748   134  ttaaggttttagatttagaggtgga  atg    taa
100408  3AORF030    2  6299..6694     131  tataaaaaaggagttggccagataa  atg    tag
100409  3AORF031    1  20833..21225   130  ttaacaaaattataggagtgagaaa  ata    taa
100410  3AORF032   -2  39984..40361   125  aaatagctgttagagggttacccct  ata    tag
100411  3AORF033    1  7957..8325     122  gaatatctgcgtcttttttatttga  ata    taa
100412  3AORF034   -2  28506..28871   121  gttatcaacctaaggaggtgataac  atg    tag
100413  3AORF035   -2  10671..11036   121  tcctagcttcctaacagcaccgcca  ata    tga
100414  3AORF036    2  30020..30382   120  accaattttaaggaggagttaatca  atg    tga
100415  3AORF037    2  21818..22165   115  aagtgtaagtaatagttaagagtca  gtg    tag
100416  3AORF038   -2  42003..42347   114  gtactcactttcaactgcttcaacc  atc    tga
100417  3AORF039    2  21386..21727   113  tccagaaaatctagagtcataggtt  ata    taa
100418  3AORF040   -3  29654..29995   113  ttgattaactcctccttaaaattgg  ttg    taa
100419  3AORF041   -1  4333..4671     112  tactaaatctacatctgatccatga  att    tga
100420  3AORF042    3  5568..5900     110  taaaaaagtggtaggtgatttttaa  atg    tga
100421  3AORF043    1  25690..26019   109  taccaaattaatatagtcttcgcat  ata    tag
100422  3AORF044    3  29676..30005   109  gtcttaaataattatataaggagtt  att    taa
100423  3AORF045    3  30..353        107  cgctagcaacgcggataaatttttc  atg    taa
100424  3AORF046    3  27894..28214   106  aagatattgaaaagctaatttcccc  ata    tga
100425  3AORF047   -2  11907..12227   106  ttcgccgccaaaatgattagcattt  ctg    tga
100426  3AORF048   -3  40343..40663   106  ccataacacatacactgtatgatct  ctg    taa
100427  3AORF049   -3  6749..7069     106  tgttaaaccatcttcagattctcca  ata    taa
100428  3AORF050    1  42700..43014   104  ttatgcaatcaaagaggtgtaagag  atg    taa
100429  3AORF051   -2  13077..13388   103  ttgtacgtaatcccacacatcgccg  att    tga
100430  3AORF052   -3  3722..4024     100  gcatttcatttcctcctaataactc  att    tga
100431  3AORF053    3  17145..17444    99  tcgagacaatggatatagggagtgt  att    tag
100432  3AORF054   -1  19915..20211    98  ataatttatagcttgcgaaacataa  ata    tga
100433  3AORF055   -1  42436..42729    97  aatcgtattgatatgacttacgacc  atg    tag
100434  3AORF056    3  40455..40745    96  taaattttgtatacaaggtgaataa  atg    tga
100435  3AORF057   -1  38665..38952    95  atcatcaccgtcttgccattgacgt  att    taa
100436  3AORF058   -1  21265..21549    94  gaaatttctatctaacttgtcataa  att    tga
100437  3AORF059   -2  10278..10562    94  tttagccgcgcttccaactgcacgt  att    tag
100438  3AORF060    1  5278..5556      92  atatcagccgaataggggtgatgaa  atg    tag
100439  3AORF061    1  35668..35946    92  tttggaaagaaggagagttgattaa  ata    taa
100440  3AORF062    2  35912..36187    91  gttaaatttggaatggaattaaaca  ata    taa
100441  3AORF063    3  36720..36995    91  cggaagtagcggagtgcaaagacat  att    tga
100442  3AORF064   -2  35694..35969    91  [illegible]                ctg
100443  3AORF065   -2  32697..32972    91  [illegible]                ata    taa
100444  3AORF066    3  29157..29429    90  caaactttaacatttatctaaagga  gtg    tag
100445  3AORF067   -2  [illegible]     90  [illegible]                ttg    taa
100446  3AORF068   -2
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3AORF226 


-2 


17184. .17312 


42 


cgcctatttttaaattatctaattt 


att 


tag 


100605 


3AORF227 


-2 


1425. .1553 


42 


atcttcttcccattctctatagggt 


att 


taa 


100606 


3AORF228 


-3 


31055 . .31183 


42 


cattttttgatgtcaggcagtttat 


ata 


taa 


100607 


3AORF229 


-3 


22592. .22720 


42 


gttataaccatgaccggctacaagc 


ata 


taa 


100608 


3AORF230 


-1 


27883. .28008 


41 


gaaggcagggt cgt t t ct tggat t a 


ttg 


tag 


100609 


3AORF231 


-2 


29968. .30113 


41 


gcttctttaactttctcttgtacaa 


ttg 


taa 


100610 


3AORF232 


-2 


22485. .22610 


41 


tatctgggaaatttaatctaataaa 


ata 


tag 


100611 


3AORF233 


-2 


9264 . .9389 


41 


aagtttgccgaaatgactttgagct 


ate 


tga 


100612 


. 3AORF234 


-3 


23033. .23158 


41 


acctaattcagataagcgataattt 


ata 


tga 


100613 


3AORF235 


1 


25558. .25680 


40 


aacactgctgaaatagacgtctttt 


ata 


tag 


100614 


3AORF236 


1 


34420. .34542 


40 


acattgagagaagtttcagaaaaat 


atc


taa 


100615 


3AORF237 


3 


38442. .38564 


40 


gaagaagctatagaaacttttattc 


ctg 


taa 


100616 


3AORF238 


-1 


33628. .33750 


40 


caatcattagaaaaccttttttcat


ata 


taa 


100617 


3AORF239 


-1 


29248. .29370 


40 


tcttctaatttagaaatattaatca 


atg 


tag 


100618 


3AORF240 


-2 


18156. .18278 


40 


gtctctcaattctgtatagaatttt 


att 


taa 


100619 


3AORF241 


-2 


8088..8210


40 


tttcaaggcttttgtataagtttta 


gtg 


tga 


100620 


3AORF242 


-3 


39149. .39271 


40 


ttagcaaagcagatttacctacacc 


ttg 


taa 


100621 


3AORF243 


-3 


23558. .23680 


40 


aaaattaactgtttattaattttaa 


ata 


taa 


100622 


3AORF244 


-3 


1697. .1819 


40 


catttcattaaaggattattattaa 


ata 


tga 


100623 


3AORF245 


1 


19015. .19134 


39 


agttatgcaaggaatatgatgactt 


ttg 


tag 


100624 


3AORF246 


1 


22504. .22623 


39 


gctaatctaaacactttcacatcgt 


ttg 


taa 


100625 


3AORF247 


-1 


40567. .40686 


39 


aaagtatttacttgttctttattcc 


ata 


taa 


100626 


3AORF248 


-1 


23956. .24075 


39 


tttagattcatgaaacgaagtagca 


ata 


taa 


100627 


3AORF249 


-1 


11113. .11232 


39 


cacctttccccaacacttttacagt 


atc


tga 


100628 


3AORF250 


-1 


8719. .8838 


39 


ttttattagcttctactagctttaa 


ata 


taa 


100629 


3AORF251 


-2 


16899. .17018 


39 


aactcgtctgttaagcgcttgttga


att 


tga 


100630 


3AORF252 


-3 


37025. .37144 


39 


acaactgccctaatttaataactgc 


att 


tga 


100631 


3AORF253 


-3 


29138. .29257 


39 


tctacatactccaaacaattgatgg 


att 


taa 


100632 


3AORF254


-3 


15476. .15595 


39 


caaatcaattcattaaaatccatta 


ctg 


taa 


100633 


3AORF255 


1 


13552. .13668 


38 


ttaatagacaaagtaaaatcgtggt 


ttg 


tag 


100634 


3AORF256 


2 


12545.. 12661 


38 


aaaagtgcaaagggctggctaacgg 


ata 


taa 


100635 


3AORF257 


2 


41870. .41986 


38 


gggcatggattaaacttacaacaag 


gtg 


tga 


100636 


3AORF258 


3 


10827. .10943 


38 


tcaaacttttgaaaaacggtttagg


att 


taa 


100637 


3AORF259 


-1 


34570. .34686 


38 


gtgacatcgaaccagtacggatcac


gtg 


tga 


100638 


3AORF260 


-1 


32389. .32505 


38 


aagcaggtaagccaatacgcattga 


att 


tag 


100639 


3AORF261 


-1 


23830. .23946 


38 


cctttttaacttttaataaaattaa 


ata 


tga 


100640 


3AORF262 


-1 


8158..8274


38 


ccatctcttctggttcagtttctga


atc


taa 


100641 


3AORF263 


-2 


14001. .14117 


38 


ttatacctgcatttcctcctgattc 


gtg 


tga 


100642 


3AORF264 


-2 


294 . .410 


38 


tttgcttgtttttattttcccttga 


gtg 


taa 


100643 


3AORF265 


-3 


42683. .42799 


38 


tgacaaagataattatctctatcta 


atg 


tga 


100644 


3AORF266 


-3 


31979. .32095 


38 


aatcctcatcatcagtgtctaattc 


atc


taa 


100645 


3AORF267 


-3 


26306. .26422 


38 


ctgtaacaacttgatttaagaatac


atc


tga 


100646 


3AORF268 


-3 


16490. .16606 


38 


tacatacaaggcttagcttttttat


ttg 


tag 


100647 


3AORF269 


-3 


9872..9988


38 


tgagacccctctaaccctgagttag 


ata 


tag 


100648 


3AORF270 


1 


21829..21942


37


at ctg 1 fc clog ay t Cay t y u i» v^y y wa 


ctg 


fan 
wdy 


100649 


3AORF271 


2 


29468..29581


37 


tgagcgacacatataaaagctacct 


att 


taa


100650 


3AORF272 


3 


2955. .3068 


37 


gagttaaacagattttacttgcagc 


ata


taa


100651 


3AORF273 


3 


5010. .5123 


37 


tttggcaaaccagtagtatttacag 


acg


taa 


100652 


3AORF274 


3 


19956. .20069 


37 


tcaagtatagatgaattaaagcaac 


ttg 


tga 


100653 


3AORF275 


3 


39882. .39995 


37 


gatatgttaccaacaggaaatgtag 


att 


taa 


100654 


3AORF276 


-1 


27211. .27324 


37 


attaagtgcgcttatttaattagat


att 


tga 


100655 


3AORF277 


-1 


13516. .13629 


37 


cgaccgtcattaaagttaagtccac 


ctg 


tga 


100656 


3AORF278 


-1 


11893. .12006 


37 


ttttatatacacgaccactggataa 


atc


taa 
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100657 


3AORF279


-2 


17535. .17648 


37 


tttgtaaagatttgtttactgctgc 


ttg 


taa 


100658 


3AORF280 


-2 


6474. .6587 


37 


tcaaaataagcatctaactgactag 


atg 


taa 


100659 


3AORF281 


-2 


759. .872 


37 


ttttgatatcgttgcgtcataatgg 


att 


tga 


100660 


3AORF282 


-3 


36608. .36721 


37 


cccaaaacctccttgactcgatcta 


ata 


tga 


100661 


3AORF283 


-3 


14960. .15073 


37 


tttcagttgaagaaccatcttttaa 


att 


taa 


100662 


3AORF284 


1 


18859. .18969 


36 


atgttaacagagccaggtctttact 


att 


taa 


100663 


3AORF285 


2 


8237. .8347 


36 


aaaacttatacaaaagccttgaaag 


ata 


taa 


100664 


3AORF286 


3 


5157. .5267 


36 


tatgatcagcaacgtacattagaca 


gtg 


tag 


100665 


3AORF287


3 


38610. .38720 


36 


tttgatttagtacgcatacacttat 


atg 


taa 


100666 


3AORF288 


-1 


36454. .36564 


36 


tttatgacataactaccattcatac 


ata 


tga 


100667 


3AORF289 


-1 


30217. .30327 


36 


aacaattttttcataatgctcttct 


ttg 


taa 


100668 


3AORF290 


-1 


16678. .16788 


36 


gctttttcgcaaattctaacagctt


atc




100669 


3AORF291 


-2 


14310. .14420 


36 


gtctagttaaagggataaccatctc


ctg 


tga 


100670 


3AORF292 


-2 


11457. .11567 


36 


ttctttcaattctttgattttctga 


ttg 


tga 


100671 


3AORF293 


-3 


29462..29572


36 


ttcataaaagtattccttataaaat 


atg 


tag 


100672 


3AORF294 


-3 


22388. .22498 


36 


accattccaattttggccaaacgat 


gtg 


tag 


100673 


3AORF295 


-3 


18629. .18739 


36 


aaaaggaacgcctcttgagtgaagt 


att 


tag 


100674 


3AORF296 


-3 


6332.. 6442 


36 


tatcagacatgaagtctgaaggtaa 


atc


taa 


100675 


3AORF297 


1 


13984. .14091 


35 


aaatggttgaagtcacttaaaggta 


gtg 


tag 


100676 


3AORF298 


1 


40174. .40281 


35 


tatcaaatgttgcatcattttttga 


gtg 


taa 


100677 


3AORF299


2 


1481. .1588 


35 


gccgcgtgtgctacttttgcgttag 


ata 


taa 


100678 


3AORF300


2 


40451. .40558 


35 


aatataaattttgtatacaaggtga 


ata 


tag 


100679 


3AORF301 


3 


25479. .25586 


35 


accactagttaaaacttcatatact 


ata 


taa 


100680 


3AORF302


3 


32106.. 32213 


35 


gaagatgatttcgatgaattagaca 


ctg 


tga 


100681 


3AORF303 


3 


36024. .36131 


35 


gacacagagggattattaaaagaga


ttg 


tag 


100682


3AORF304 


-1 


37762. .37869 


35 


accgacaaatccgccaacatctttt 


ata 


tga 


100683 


3AORF305


-1 


24088. .24195 


35 


tttatctttaacaaaatcaaactga 


ata 


tga 


100684 


3AORF306 


-1 


19507. .19614 


35 


atcattaggtaattgaaattttaaa 


ata 


tga 


100685 


3AORF307 


-1 


16081. .16188 


35 


atgtactgacagttgcagatacagt 


atc


tag 


100686 


3AORF308 


-1 


11398. .11505 


35 


tttctttagttctagttaaaatgtt 


ttg 


taa 


100687 


3AORF309


-2 


33003. .33110 


35 


aaacagacctcttacccgttcatca 


ctg 


taa 


100688 


3AORF310 


-2 


24894. .25001 


35 


gtaaatcgaaatcgctaccagctga 


att 


taa 


100689 


3AORF311 


-2 


22005. .22112 


35 


ttcgtaggtgtcattacttctttaa 


ttg 


tag 


100690 


3AORF312 


-2 


21711. .21818 


35 


aaaataaaaagccagtgccgaagca 


ctg 


tag 


100691 


3AORF313 


-2 


17901. .18008 


35 


cattaggtcttagacgacttagcat 


ata 


taa 


100692 


3AORF314


-2 


16710. .16817 


35 


taattcagtcttaggagtatcattt 


att 


tag 


100693 


3AORF315 


-2 


15990. .16097 


35 


acatatctccgtatcatttgggtaa 


att 


tag 


100694 


3AORF316 


-2 


2862. .2969 


35 


aattcttcttcatactgtttgacga


ttg 


tag 


100695 


3AORF317 


-3 


40217. .40324 


35 


tccctaacactactttttaaacttt 


ata 


tga 


100696 


3AORF318 


-3 


37535. .37642 


35 


tgttcggctcctttcattattttaa 


ata 


taa 


100697 


3AORF319 


-3 


34421. .34528 


35 


ttcttcatcttttatttgactctgc 


ata 


tga 


100698 


3AORF320 


-3 


28262. .28369 


35 


catttgttggtaatatcttagttcg 


atg 


tga 


100699 


3AORF321 


1 


23989. .24093 


34 


taaaaaggtttaatataaaaatgta 


ata 


tga 


100700 


3AORF322


1 


34660. .34764 


34 


aagagaagattgagaccatggcttt 


atg 


taa 


100701 


3AORF323 


3 


30105. .30209 


34 


ctaaatactgaactatcaactgtag


att 


taa 


100702 


3AORF324 


3 


30258. .30362 


34 


ggaaaagagttccttaaaaaagcag 


ata 


tga 


100703 


3AORF325 


3 


40236. .40340 


34 


gttgtatcatttttggtgatgcaac 


att 


tag 


100704 


3AORF326 


-1 


36964. .37068 


34 


cgcatcaacaactgtaaacctttga 


ttg 


tga 


100705 


3AORF327 


-1 


35242. .35346 


34 


atttttgtctgttgtataatatttt 


ctg 


taa 


100706 


3AORF328 


-1 


21916.. 22020 


34 


ccatttaccttcttgagatgttgga 


ttg 


tga 


100707 


3AORF329


-1 


18820..18924


34 


ggtggcttaacttccaagaaccaac 


ctg 


taa 


100708 


3AORF330 


-1 


15631. .15735 


34 


ttatgaagttttcacaaattagtaa 


atc


tag 


100709 


3AORF331 


-2 


37998. .38102 


34 


ttacgcccaatagcttcatactcat


ctg 


tag 


100710 


3AORF332 


-2 


7359. .7463 


34 


tttataaacctttaaagttttagtc 


ata 


taa 


100711 


3AORF333


-3 


24584 . .24688 


34 


aaaaattataaaactataaaaccat 


atc


taa 


100712 


3AORF334


-3 


24269. .24373 


34 


tatttttaggtagataatttattaa 


atc


tga 


100713 


3AORF335


-3 


14273. .14377 


34 


cacttcagcaagttgatgctttgta 


atc


tga 


100714 


3AORF336 


2 


7559. .7660 


33 


gtaactttatctaatttagaagcgg


ata 


tag 


100715 


3AORF337 


2 


13277. .13378 


33 


aatataggtaaaaaagcaggagaat


ttg 


tag 


100716 


3AORF338 


3 


9501. .9602 


33 


taggacgtacgatgacgatgggcgt 


atc


taa 


100717 


3AORF339 


3 


27348. .27449 


33 


atatctaattaaataagcgcactta 


att 


tga 


100718 


3AORF340 


-1 


37372..37473


33 


ttctatggttttcatcttatgagaa 


atg 


taa 


100719 


3AORF341 


-1


33421.. 33522 


33 


aagctaattcggacacttttccttt 


ttg 


taa 


100720 


3AORF342


-1


29047..29148


33


*- f. +-nnr*a t* <"t- rfaf cart rrf ^ t" a a 


ata 


taa 


100721 


3AORF343


-1


7549..7650


33 


atgatacgcctgagactagaattgg 


att


taa 


100722 


3AORF344


-1


7297..7398


33 


ctgctgaaactgttgcagattttga 


att




100723 


3AORF345 


-2 


23850. .23951 


33 


ttaaacctttttaacttttaataaa 


att


taa 


100724 


3AORF346


-2 


20607. .20708 


33 


aaagatgtacgactagatttagtta 


atc


taa 


100725 


3AORF347 


-2 


14175. .14276 


33 


atctgttgttaaagaacgctaataa 


ctg 


taa 


100726 


3AORF348 


-2 


6984. .7085 


33 


cgtacactggttgacctgttaaacc


atc


tag 


100727 


3AORF349 


-2 


6882. .6983 


33 


tagaacgaccaataactgtatttag 


atc


taa 


100728 


3AORF350 


-3 


40748. .40849 


33 


aactgcaattcactaaatgctgtaa 


gtg 


tga 
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100729 


3AORF351 


-3 


38345. .38446 


33 


ggttagtagaatgtttttcgtataa 


atc


taa 


100730 


3AORF352 


-3 


38081. .38182 


33 


tagttgaaggccaatacattaacct 


atg 


taa 


100731 


3AORF353


-3 


35432. .35533 


33 


tagcattctcatatgatgcagattt 


aca 


taa 


100732 


3AORF354 


-3 


34952..35053


33 


ttatcctgatacagatatctcttag 


atc


taa 
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Table 9 



Bacteriophage 96, complete genome sequence 

1 catagttata ggcttttcag ctatatacca agataagatt tatcccgccg tctccataaa aatatgcttg 

71 gaaaccttga tttaatgggg ttttaatcta gcaagtgtca aatatgtgtc aagaaaataa ttttctgaca 

141 cgttgacctt gctctttttt atgttcatca agtaagtgag agtaggtgtc taaagttata gatatattat 

211 aatggcctaa tcttttgcta atatattcaa taggtatacc tttagaaagt aggaaagatg tatgcgtgtg 

281 tcttaatgaa taaggtgtta ttgtagtatc atttagtcct atttgactct tagcatggtt aaatgacttt 

351 ttaacggcat tatgactcaa tttaaacaac ttattatctg tacgttttgg taattttgat aatttagctt 

421 taatatgttg tatatccttt tttggtacct ccacaagtct gtccgcgtta actgtttttg ttccacgaag 

491 atgtattgta ccctcttttt cgtttagatc gataggcaac atattaatta catcgctgta tcttgcacca 

561 gtgatagcta ggatgaataa aaaaatataa ctcgattcgt ctctagattt aaagtattct atcaattgca 

631 agtattgttc tatggtgatg aatttagagt gttcgtcttt tgattttttt gtaccacgaa tatctatttg 

701 atagctaggg tctttcttta aatagccctc atatactgca tctctgaagc attgtgataa acaactgttt 

771 aatttacgaa ccgtttcatt agtacgacct cgaccgaatt cgttcaaaaa cttttgatac tccgaacgtt 

841 tgatgttttt tattaaaaaa tcactcccga aatattcgtt aaataatttt aatgaacgtt gataccaata 

911 gaattgttgt gaagcgacat gtttcttatt ttttgaatct aaccaatcat tgtaatattc ttcaaacttt 

981 ttattttcat ctaaattgtt tccatcatcc aaatctctaa gcagttgttg agcagcgttg gttgcctcag 

1051 ctttagtttt gaatcctgac tttcttttct ttcctgattt gaaagacgga tgttttacgt cgtactgcca 

1121 agatgctgtt gctttattct tcctttttgt aattgtaaat gacgccattt tacttttcct cctcaaaatt 

1191 ggcaaaaaat aataagggta ggcgagctac ccgaaatttt attgttgaac aactattgct tcacttcttg 

1261 cttttcctac ttcttttcta aaactatcat atgattgatt agggtgtgtt aacgacattc ctggaccacc 

1331 tccagcatgt tggtttttgt ccggattatt ttccatttct tcagtggctc ttttagcatt taaatattct 

1401 tcgtaactag gttcgtttgg gtcgcgtggt tgtgcttgtt gtccattatt ggtagctgga agattcttct 

1471 gtacctgttg cttagatgtg ttattggttt gttgattgtt gttaatgttt gtgttgttct cgttgtttac 

1541 ttgattattg ttatcgtttt gattactatt ttcttttttc gcttctgctt tatctttagt ttctttcttt 

1611 ttgtctttgt tctctttctt tgtttcggtt ttcttgcttt cctctttctt atcgccgtcg ttgctaccgc 

1681 atgcacctaa cactaacgca ctagctaata ataaaactaa taatcttttc atgttttaca ctcctttatt 

1751 tgctatttgt tttaataaat ctatgatttc attgttttgt tctatgattt tgttttcatt tttaagatgt 

1821 tcgtctaaca tctctattaa gacgaaattt tgatttatca tttcgtaagt aaacatttga cctgtgttgt 

1891 taggattaga aaacgaacta ctgaaacgcg ttgaaaagct atctataaat tgaccaactt tattttttaa 

1961 taacatatct ttaccgctct cagacattgt atttagttcg cgcttattta aagttttttc tataattttg 

2031 tattttgttt cctgatttct ttcgatttct tctacttcaa aagggatatt gttattaaat ttttcgataa 

2101 tatcacgttt ttcagaaact gacatacgat caaatacttg tttttgacct ttatttaact tccctcgaat 

2171 ttttccggca gtccaagact ctttaactgt taacttatca ttaggaactt gattcatctt ttatatgact 

2241 ccttttctca tatttcttta tatttaaaaa ctctcaacgg ctcaaatgta atcgaatact cgccatagtg 

2311 agttccaata ccgtatatct tcttatattg ttctattgcc tccaatatgt attcttcgct taattgtaga 

2381 tactcagaca actcatacaa gttacgtacg ccataattgt aagcttctac aatttcgcgt aacgggactg 

2451 ctgagataaa gccgtgtcgt cttgcgtaat tttcgaactt gcgattgttg aatttcgatt gatctaaaat 

2521 gttgccatac gtcaacttgt ggtgggcaag ttcttcatat aatacttcta atttgttcct ttcggataag 

2591 gaaggtctaa taaaaatttc tccttcttga taccaaccat cgaatcctcg aggtactctt tgtgtttctt 

2661 tcacttcaac ttcacatttc ataagcaatt cttcgtattt tcccatgcgc caaacccctt tggtgtctta 

2731 tttctttcta tctctaaccc attgcataaa attttcgatt tcttcccatt cttcgggagt aaattcatct 

2801 ttatttgcat gaccggctat agtttcttga tgaatacttc tttcttctgt aattctcgat ttaggtacat 

2871 taaagtaatc tgctaattgt tggacttttg atattctagg atatttaagt tctttaagcc agttagagat 

2941 tgttgattga cttaccccga ttgcttcaga caattctact tgagtaatgt tgttctcttt cataagttgt 

3011 tctaagttct ctgataaaat ttttctagca ctcttatatt ccataatttt ctcctttagt attacttaat 

3081 gtaatactaa tttaccataa gtaatatcac ttttcaatac aaaatattac ttttttgaaa taaatatcac 

3151 tttaggtgtt gacatattac tttaagtgat agtatagttg taaatgtcaa cgggaggtga tacgaaatgc 

3221 cagaaaattt taaagagttc tctgtaaagg tctggagaac taattcgaat atgacacaac aagatgtcgc 

3291 tgataaatta ggcgttacta aacaatctgt aataagatgg gaaaaagatg acgcagaatt aaaaggctta 

3361 caattgtatg ctttagccaa attattcaac acagaagttg attatataaa ggctaaaaaa atttaacatt 

3431 aatatcactt taagtgataa aggaggaaac tgaaatgcaa gaattacaaa catttaattt tgaagaatta 

3501 ccagtaagga aaattgaagt ggaaggagaa cccttctttt taggtaagga tgttgctgaa attttagggt 

3571 atgcacgagc agataacgcc atacgcaatc atgttgatag tgaagatagg ctgatgcacc aaattagtgc 

3641 gtcaggtcaa aacagaaata tgatcatcat caacgaatct ggattataca gtttaatctt tgacgcttct 

3711 aaacaaagta aaaacgaaaa cattagagaa accgctagga aattcaaacg ctgggtaact tcggaagttt 

3781 taccgacgtt aagaaaaact ggtgcttacc aagtacctag tgacccaatg caagcattga gattaatgtt 

3851 tgaagctaca gaagaaacaa aacaagaaat taaaaacgtg aaagatgatg ttattgattt gaaagaaaat 

3921 caaaaactgg atgcgggaga ctacaatttc ttaactagaa caatcaatca aagagtagct catatacaaa 

3991 gactacatgc gataacaaac caaaaacaac gtagcgaatt attcagggat attaattcag aagtgaaaaa 

4061 gatgactggt gcgagttcaa gaacgaacgt aagacaaaaa catttcgacg atgtaattga aatgattgct 

4131 aattggttcc cgtcacaagc tactttatac agaatcaagc aaattgaaat gaaattttaa aacgaaatat 

4201 aggagaggct gaatatggaa tacatcggat atgcagacgc aaatgcgttt gtaaaaataa gtggcatttc 

4271 aaaagatgat ctagagaaaa aagtctactc gaacaaagag tttcaaaaag aatgcatgta cagatttggt

4341 cgaggacaaa agcgttatat aaaaattgac aaagctattc aatttatcgg taccaattta atgattaatg 

4411 aatacgaatt ataggaggag ttatcaaatg agtaaaactt ataaaagcta cctagtagca gtactatgct 

4481 tcacagtctt agcgattgta cttatgccgt ttctatactt cactacagcg tggtcaattg caggattcgc 

4551 aagtatcgca acattcatat actacaaaga atacttttat gaagaataaa aaaactgcta cttgcgtcaa 

4621 caagtaacag tgacaaacat ttatcaaaat atacaactta attaaatcaa aatatacgga ggtagtcaac 

4691 tatggctgaa aatattaaaa ctgaacaaca ttattacact aaagatttct caggatacag aaacgaagaa 
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4761 gataactttg tagcaaatca agaattgaca gcaacaatca cattgaacga gtacagaaaa cttattgaaa 

4831 taaaggctgt taaagataaa gaagaagaca cctacagagg taagtatttt gcggaagaaa gaaaaaacga 

4901 aaaattggaa aaagaaaata taaaactaaa aaacaaaatt tatgaattac aaaacgaaga agataacgag

4971 gaggacgaag aagacaagga ggacgagaac gatgtattac aaaattggtg agataaaaaa caaaattata

5041 agctttaacg ggtttgaatt taaagtgtct gtgatgaaga gacatgacgg tatcagtata caaatcaagg 

5111 atatgaataa tgttccactt aaatcgtttc atgtcataga tttaagcgaa ctatatattg cgacggatgc 

5181 aatgcgtgac gttataaacg aatggattga aaataacaca gatgaacagg acaaactaat taacttagtc 

5251 atgaaatggt aggaggtatg aaaagtgaat gatttacaag agagagaatt agaaacattc gaacaagacg 

5321 accgattcaa agtaactgat ccagacagtg ctaactgggt ttttaagaaa ctggatgcaa tcacaactaa 

5391 agagaatgaa atcaacgatt tagcaaataa agaaattgaa cgcataaacg aatggaaaga taaagaagta 

5461 gaaaaattac agagtggcaa agaatattta caaagccttg taattgaata ttacagaata caaaaagaac 

5531 aagatagcaa attcaagttg aatacacctt acggaaaagt gacagccaga aaaggttcaa aagtcattca 

5601 agttagcaat gagcaagaag tcattaaaca acttgagcaa cgaggttttg acaactatgt aaaagtaact 

5671 aaaaaactta gccaatcaga cattaagaaa gatttcaatg taactgaaaa cggcacattg attgacgcaa 

5741 acggcgaagt tttagagggt gctagcattg tggagaaacc aacgtcatac acggtaaagg tgggagaata 

5811 gacgactgaa aaaactaatc aagatgtcga tattttaacg caactaggtg taaaagacat cagcaaacaa 

5881 aatgcaaaca agttttataa atttgcgata tacggcaagt tcggtactgg taaaactacg tttttaacaa 

5951 aagataacaa taccttagta ctagatataa atgaggacgg aacaacggta acagaagatg gggcagttgt 

6021 gcagattaag aattataagc attttagtgc agtgattaaa atgctgccta aaattattga acaactaaga 

6091 gaaaacggaa aacaaattga tgttgtagtg actgaaacaa tccaaaagtt acgtgatatc actatggacg 

6161 acatcatgga cggtaaatca aagaaaccga catttaatga ttggggcgag tgtgctacac gcattgtaag 

6231 tatttatcgt catatttcta aattacaaga acattatcaa tttcatcttg ctataagcgg acacgagggc 

6301 attaacaaag acaaagatga tgagggaagt actatcaatc caacaatcac gatagaggca caagaccaaa 

6371 taaaaaaagc agtcatcagt caatctgacg tgttagcaag aatgacaata gaagaacatg agcaagacgg 

6441 cgaaaaaact catcaatatg tacttaacgc tgaaccatca aatttattcg agacaaagat aagacactca 

6511 agcaacatca aaattaacaa caaacgtttc attaatccaa gtattaacga tgttgtacaa gcaattagaa 

6581 atggtaatta aaaattaatt aaaaggacgg tataaaaatt atgaaaatca ctggtagaac acaatacatt 

6651 caagaaacta atcaagaggc attcatgaaa ggtggggact ttttaggagc tggagaattt acagtaaaag 

6721 ttgcaaatgc cgagtttaac gacagagaaa acagatactt cacgattgtt tttgaaaaca acgaaggtaa 

6791 acaatacaaa cacaaccaat tcgtcccacc attccaacaa gattatcaag aaaaacaata tatcgagtta 

6861 cttagtagat taggaattaa attgaactta ccagatttaa cttttgacac agatcaatta attaacaaaa 

6931 tcggaactat tgtacttaaa aataaattta acgaggaaca aggcaagtat tttgtaagac tctcatatgt 

7001 aaaagtttgg aataaagacg atgaagtagt taataaacca gaacctaaaa ctgatgagat gaaacaaaaa 

7071 gaacagcaag caaatggtaa acagacacct atgagtcaac aatcaaaccc attcgctaat gctaatggtc 

7141 caatagaaat caatgatgat gatttaccgt tctaggacgt ggtttaaatg caatacatta caagatacca 

7211 gaaagacaat gacggtactt attccgtcgt tgctactggt gttgaacttg aacaaagtca cattgattta 

7281 ctagaaaacg gatatccgct aaaagcagaa gtagaggttc cggacaataa aaaactatct atagaacaac 

7351 gcaaaaaaat attcgcaatg tgtagagata tagaacttca ctggggcgaa ccagtagaat caactagaaa 

7421 attattacaa acagaattgg aaattatgaa aggttatgaa gaaatcagtc tgcgtgactg ttcaatgaaa 

7491 gttgcgagag agttaataga actgattata tcgtttatgt ttcatcatca aatacctatg agtgtagaaa 

7561 cgagtaagtt gttaagcgaa gataaagcgt tattatattg ggctacaatc aaccgcaact gtgtaatatg 

7631 cggaaagcct cacgcagacc tggcacatta tgaagcagtc ggcagaggta tgaacagaaa caagatgaat 

7701 cactacgaca aacatgtgtt agcactgtgt agacaacatc ataatgaaca gcacgcaatt ggtgttaagt 

7771 cgtttgatga taaatatcaa ttgcatgact cgtggataaa agttgatgag aggctcaata aaatgttgaa 

7841 aggagagaaa aatgaataag ttactaatag atgactatcc gatacaagta ttaccgaaat tagctgaatt 

7911 aatagggtta aacgaagcaa tagtattgca acaaattcat tattggctaa acaactcaaa acataaatac 

7981 gatggcaaaa cttggatttt taattcttat ccagaatggc aaaaacaatt tccattttgg agcgagagaa 

8051 ctataaaaag gacatttggg agtttagaaa aacaaaattt attgcatgta ggtaactaca acaaggctgg 

8121 acttgaccgt acaaaatggt attcaatcaa ttatgaaaca ttaaacaaac tagtggcacg accatcggga 

8191 caaaatggcc cgacgatgag gacaaattgg cacgatgcaa gaggacaaaa tgacccgacc aataccatag 

8261 actacacaga gactaacaaa catagagaga cagacgacgt ctcaaagtca tttaagtata ttagtaccaa 

8331 tttagaaatt atacaaaacc ctttaaaagc agaacagtta gaacacgaaa ttaaatcatt taagcaagat 

8401 cagttcgaaa tagtaaaagt cgctaccgat tactgcaaag aaaacaacaa aggtctgaat tacttactaa 

8471 ctgtattaaa gaactggaat aaagaaggcg cttcagataa agaaagtgct gaaaacaaat tgaaacctcg

8541 taactctaaa aaagaaacta ctgatgatgt catagcacaa atggaaaaag aattgagtga tgactaatgc 

8611 cgatgagcaa aacacaagca ttagaaatta ttaaaaaagt taggtacgta tacaacatcg attttgataa 

8681 accaaagtta gaaatgtgga ttgatgtatt aagtcaaaac ggggattatc aaccaactgt aaaagctgta 

8751 gatggatata tcaacagtaa caacccgtac ccgcctaacc taccagcaat catgcgtaag gcacctaaaa 

8821 aagtatctat tgagccggta gacaacgaaa ccgctacaca ccaatggaaa atgcagaatg accccgaata 

8891 tgtcagacaa agaaaaatag cgctagataa cttcatgaat aagttggcag aatttggggg cgataacgaa 

8961 tgaattacgg tcaatttgaa attgaaagca caataatcgc tacgctactt aaacaaccgg acgtactaga 

9031 aaagataaga gttaaagatt acatgtttac gaacgaaaag tttaaaacct ttttcaatta tgtaatggac 

9101 gtcggaaaga tagatcatca agaaatctat ttaaaagcaa ctaaagataa agagttttta gatgcagata 

9171 ctataactaa actttacaac tccgatttca ttggatacgg attctttgaa cgttatcaac aagaattatt 

9241 ggaaagttat caaatcaaca aagcgaaaga attggtaact gagttcaaac aacaacctac gaaccaaaat 

9311 tttaataact tgattgatga actcaaggat ttaaaaacaa ttactaacag aaaagaagac ggaaccaaga 

9381 agttcgttga ggagtttgtc gatgagttat acagcgatag ccctaagaag caaattaaga cgggttataa 

9451 gctcatggat tacaaaatag ggggattgga gccgtcgcaa ttaatcgtca tcgcagcgcg tccctcagtg 

9521 ggtaagacag gttttgcatt aaacatgatg ctgaacatag cacaaaatgg atacaaaaca tctttctcca 

9591 gtctcgaaac aactggcaca tcagtattga aacgtatgtt atcaacaatt actggtattg agttaacaaa

9661 gataaaagaa atcaggaact taacgccgga tgacttaaca aagttaacga atgcgatgga taaaatcatg 

9731 aaattaggca tcgatatttc tgataaaagt aatatcacac cgcaagatgt gcgagcgcaa gcaatgaggc 

9801 attcagacag gcaacaagtt atttttatag attatcttca actgatggat actgatgcga aagttgatag 

9871 acgtgtagca gtagaaaaga tatcacgtga cttaaagata atcgctaacg agacaggcgc aatcatcgta 

9941 ctactttcac aactgaatcg tggtgtcgag tctagacagg ataaaagacc aatgctatcg gacatgaaag 

10011 aatcaggcgg aatagaagca gatgcgagtt tagcgacgct actttaccgt gatgattatt ataaccgtga 



WO 00/32825 PCT/IB99/02040




10081 cgaagatgac agtatcactg gcaaatctat tgttgaatgt aacatagcca aaaacaaaga cggcgaaacc 

10151 ggaataattg aatttgagta ctacaagaag actcagaggt ttttcacatg aatataatgc aattcaaaag 

10221 cttattgaaa tcgatgtatg aagagacaaa gcaaagcgac ccgattgtag caaatgtata tatcgagact 

10291 ggttgggcgg tcaatagatt gttggacaat aacgagttat cgcctttcga tgattacgac agagttgaaa 

10361 agaaaatcat gaatgaaatc aactggaaga aaacacacat taaggagtgt taaaaaatgc cgaaagaaaa 

10431 atattactta taccgagaag atggcacgga agatattaag gtcatcaagt ataaagacaa cgtaaatgaa 

10501 gtttattcgc tcacaggagc ccatttcagc gacgaaaaga aaattatgac tgatagtgac ctaaaacgat 

10571 ttaaaggcgc tcacgggctt ccatatgagc aagagctagg attgcaagca acgatatttg atatttagag 

10641 gtggcacaat gagtaaatac aatgctaaga aagttgagta caaaggaatt gtatttgata gcaaagtaga 

10711 gtgcgaatat taccaatatt tagaaagtaa tatgaatggc actaactatg accgtatcga aatacaaccg 

10781 aaatttgaat tacaacctaa attcgggaaa caaagaccga ctacgtatat agccgatttc tctttgtgga 

10851 aggaagggaa actggttgaa gttatagacg ttaaaggtaa ggcgactgaa gttgccaaca tcaaagcgaa 

10921 gatattcaga tatcagtata gagatgtgaa tttaacgtgg atatgtaaag cgcctaaata cacaggtcaa 

10991 gaatggatgg tatatgagga cttagtgaaa gtcagacgta aaagaaaaag agaaatgaag tgatctaatg 

11061 caacaacaag catatataaa cgcaacaatt gatataagaa tacctacaga agttgaatat cagcattacg 

11131 atgatgtgga taaagaaaaa gatacgctgg caaagcgctt agatgacaat ccggacgaat tactaaagta 

11201 tgacaacata acaataagac atgcatatat agaggtggaa taaatgaagt tgaacgaagt attcgcaact 

11271 aatttaaggg taatcatggc tagagataac gtaagtgtcc aagatttgca caatgaaact ggcgtatcaa 

11341 gatcaactat tagtggatat aaaaacggaa aagctgagat ggttaactta aatgtattag ataaattggc 

11411 agatgctcta ggtgttaatg taagtgaact atttactaga aatcacaaca cgcacaaatt agaggattgg 

11481 attaaaaaag taaatgtata gaggtggaat aaatgagtat cgtaaagatt aacggtaaac catataaatt 

11551 taccgaacat gaaaatgaat tgataaaaaa gaacggttta actccaggaa tggttgcaaa aagagtacga 

11621 ggtggctggg cgttgttaga agccttacat gcaccttatg gtatgcgctt agctgagtat aaagaaattg 

11691 tgttatccaa aatcatggag cgagagagca aagagcgtga aatggttagg caacgacgta aagaggctga 

11761 actacgtaag aagaagccac atttgtttaa tgtgcctcaa aaacattctc gtgatccgta ctggttcgat 

11831 gtcacttata accaaatgtt caagaaatgg agtgaagcat aatgagcata atcagtaaca gaaaagtaga 

11901 tatgaacaaa acgcaagaca atgttaaaca accggcgcat tacacatacg gcaacattga aattatagat 

11971 tttatcgaac aggttacggc acagtatcca cctcaactag cattcgcaat aggtaatgca atcaaatact 

12041 tgtctagagc accgttaaag aatggtcatg aggatttagc aaaggcgaag ttttacgtcc aaagagcttt 

12111 tgacttgtgg gagggttaac gatggcaacg caaaaacaag ttgattacgt aatgtcatta caggaacaat 

12181 tgggattaga agactgtgaa aaatatacag acgaacaagt taaagctatg agtcataaag aagttagcaa 

12251 tgtgattgaa aactataaga caagcatatg ggatgaagag ctatataacg aatgcatgtc gtttggtctg 

12321 cctaattgtt aaaaggagtg atgaccatga acgatagcgc acgcaaagaa tacttaaacc aatttttcag 

12391 ctctaagaga tatctgtatc aagacaacga gcgagtggca catatccatg tagtaaatgg cacttattac 

12461 tttcacggac attataaaac gatgtttaaa ggcgtgaaaa agacatttga tactgctgaa gagctcgaaa 

12531 tatatataaa gcaacatgat ttggaatatg aggaacagaa gcaaccaact ttattttaga ggagatggaa 

12601 ataatggcaa agattaaaag aaaaaagaag atgacgctac tcgaactggt ggaatgggca tggaacaatc 

12671 ctgaacaagt tgaaagtaaa gtgtttcaat cagatagaat gggcacgctt ggagaatgta gcgaagtaca 

12741 tttttcaact gatgggcatg ggttttatac aaaagtagta acagataaag atatttttac tgtagaaatc 

12811 acagaggaag tcactgaaga tactgagttt gattgtctag tagaactaaa cgatattgaa ggttttgaaa 

12881 tatatgaaaa tgattcaatc agagagttga tagacggtac ttccagagcg ttttatatac taaacgaaga 

12951 taaaactatg acattaattt ggaaagatgg ggagttggta gtatgatgca aacctataaa gtatgtcttt 

13021 gtatcaagtt ctttgcatct aaatgtgatt ataaattaaa gaaacattat ttcgtgaaaa gtacgaatga 

13091 ggaaaaagcc acgaacatgg tattaaaact gattcgtaaa aagctcccgt tcgaaactgc aagcatagaa 

13161 gtcgaaaaag tggaggcaat ataatgatac aaccaacaag agaagaatta attaatttca tgaaaaaaca 

13231 tggagctgaa aatgttgact ctatcactga tgagcaaagt gcaataagac actttagagc tcaatcaaaa 

13301 gtttttaaag acgaacgtga tgagtacaag aagcaacgag atgagcttat cgaggatata gctaagttaa 

13371 gaaaacgtaa cgaagagctg gagaacatgt ggcgcacagt caaaaatgaa ttgcttggaa gatacgaaca 

13441 ttactgtttt aaaattagag aactacaccc tgagagcaaa gcgaacagga taggagctct ctatatagga 

13511 ggtaaaagca ctgcagatat tatactgtcg cgaatggaag aactagacgg aacaaatgag ttctacgaat 

13581 ttttagggca aatggaggca gacacaaatg aataaccgtg aacaaataga acaatcagcg atcagtacta 

13651 gtgcgtataa cggtaatgac acagaggggt tactaaaaga gattgaggac gtgtataaga aagcgcaagc 

13721 gtttgatgaa atacttgagg gaatgacaaa tgctattcaa cattcagtta aagaaggtat tgaacttgat 

13791 gaagcagtag gggttatggc aggtcaagtt gtctataaat atgaggagga gcaggaaaat gagtattagt 

13861 gcaggagata aagtatataa ccatgaaaca aacgaaagtc tagagattgt gcaattggtc ggagatatta 

13931 gagatacaca ttataaactg tctgatgatt cagttattag cattatagat tttattacta aaccaattta

14001 tctaactaag ggggacgagt gagtggaatg gaaacgatta aaaaatgtgg tgccgcaccc agttatcaaa 

14071 aataaaaatc taaagtcggt atacgtaaca aaagataatg tgaaagaggt tcaaaaagaa ttaggtttct 

14141 ttgaaatttt taatgaagaa gtgttattaa ctggattttt atcatttcaa aggataccta tttacattat 

14211 ttggattaat cctaaatctc ataagacgcc tagatattac tttgctaacg agcatgagat tgaaagatat 

14281 tttgaatttt tggaggacga gtaaatgctt gaaatcatcg accaacgtga tgcattgcta gaagaaaagt

14351 atttaaacga cgactggtgg tacgagctag attattggtt gaacaaacgc aagtcagaaa atgaacagat 

14421 tgatatcgat agagtgctta aatttattga ggaattaaaa cgataggaga taacgaataa atgaataatt 

14491 taacagtaga tcaattaaaa gaacttttac aaatacaaaa ggagttcgac gatagaatac cgactagaaa 

14561 tttaaatgac acagtagcta gtatgattat tgaatttgcg gagtgggtta acacacttga gttttttaaa 

14631 aattggaaga aacaaccagg taagccatta gatacacaat tagatgagat tgctgattac ttagctttca 

14701 gtttgcaatt aactctgact attgttgatg aagaagattt ggaagagact actgaggtta tggttgattt 

14771 gattgaaaat gaagttactt tacctaaact acattcagct tattttgttc atgtaatgca tacactaaca

14841 gaacaatttg taaaaggtat tgataatagt attgtacaag ttttaataat gccttttttg tacgccaata 

14911 cttactatac aatcgaccaa ctcattgacg catacaaaaa gaaaatgaaa aggaaccacg aaagacaaga 

14981 tggaacagca gacgcaggaa aaggatacgt gtaaagacat cttagatcga gtcaaggagg ttttggggaa 

15051 gtgacgcaat acttagtcac aacattcaaa gattcaacag gacaaccaca tgaacatttt actgctgcta 

15121 gagataatca gacgtttaca gttgttgagg cggagagtaa agaaggagcg aaagagaagt acgagaaaca 

15191 agttaagata aggagagatg gagatgccaa agaaaacggt aacgattgat gtagatgaaa acttattagt 

15261 agtagctagt aatgaaatat cagaactatt atatgaatat gacagtgagt taatgtcagc tgatgaagat 

15331 ggcgataata gagatatcga aaaaaaaaga gacgcattaa aacaagctat acaaattatc gataaattaa 
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15401 catgtcgagg aggcagacga tgattaacat acccaaaatg aaattcccga aaaagtacac cgaaataatc 

15471 aagaaaLla allataaaac acctgaagaa aaagctaaga ttgaagatga ctteattaaa gaaattaatg

15541 ataaagacag cgaattttac agtcctatga tggctaatat gaatgaacat gaattaaggg ctatgttaag 

15611 aatgatgccl agtttaattg atactggaga tggcaacgat gattaaaaaa ctcaaaaata tggattggtt

15681 cgalatlttt attgctggaa tactgcgatc attcggcgta accgcactga cgectgttgt catatcgcct

15751 a?ctatacag tggltagita ccaaaacaaa gaagtatatc aagggacaat cacagataaa tataacaaga

15821 gacaagatal agaagacaag ttctatattg tgttagacaa caagcaagtc atcgaaaact ctgacttact

15891 attcaLaag aaat?tgata gcgcagacac acaagctagg etaaaagtag gcgacaaagt agaagttaaa

15961 acga^ggti atagaalaca etttttaaat ttatatccgg tcttatacga agtaaagaag gtagataaat

16031 aafgatfLa caaatattaa gactattatt cttactagcg atgtatgagc taggtaagta tgtaactgag 

16101 aaaqtatata ttatgacgac ggctaatgat gatgtagagg cgccgagtga cttcgcaaag ttgagcgatc

16171 agtctg"" gatgagg^cg gaggtgtcag agtagatgta tagcaaagag tcaattgtta atatgatagg

16241 cacacltaaa Lgaagtgta atgtattagc tgatgtaata ccggaatatg atagcaactc aattgcacag 

16311 taSgcatac aagcaacgtt gccgaaacca caaggggaaa actcaagtaa agttgaagac gttgttgtga 

16381 ggc^gagag agcaaataaa aggtatgctc agatgttaaa agaggttgag ettataaacc aatcgcaaca 

16451 gaga«ggga cacgttgact tttgcttctt agagttattg aagaaaggtt acaacaggga tgcgattatc

16521 aagaaglfgc ctaactctaa attaaataga aacaacttct tagcgcgccg tgatgagtca gcagaaaaga

16591 tttatctalc acagtgacga aaatgacaaa aatgacagaa atgacgaaaa tgacaccatt "taaactgt

16661 gaattaattt tatataattg atttgtaaga attatcttaa gacgtggggt aatagccaca ttagatgttc

16731 Icatcgatgt gattgagaag egacaaacat ataaaagatg atatgttacg ctattaatca cctactacct 

16801 gcctSatgg fggg?agttt aattcttgca ttttgagtca taactatttt cctcctttca catttattga

16871 a^gtagctlc tglacaagat gtaggggcat tttttatatt taaataacta gagtaattaa cgtaaaggcg

16941 tglgafacag tgaaaacaat tgattaaatt aacaccgaag caagaaaagt ttgtgctagg actcatagag

17011 ggcaagagcc aacggaaagc atatattgac gcagggtatt cgactaaagg taagagtggg gaatatctag 

17081 ILaagaagc gagiacactt tttaaaaatc ggaaggtttc cggaaggtac gaaaaattgc gtcaagaagt 

17151 agctglacaa fcaaaatgga cacgccaaaa ggcctttgaa gaatatgagt ggctaaagaa tgtagctaag 

17221 altglcattg aaatagaggg agtgaagaaa gcgacagctg atgcattcct cgctagttta gatggtaega 

17291 atagaatgal gttaggtaac gaagttttag ctaaaaagaa aatagaaact gaaattaaga tgcttgagaa 

17361 gaagattgaa laaaiagata aaggtgacag tggaacagaa gataaaatca aacaacttca cgacgcaata 

17431 acggaagtga tcgtcaatga ataaacttaa atctttatat acggacaaac aaattgaaat attgaagcaa 

17501 acgcaaaaac aagattggtt tatgttaatt aatcacggag caaagcgtac aggtaaaaca atattaaaca 

17571 atgacttatt tttacglgag ttaatgcgtg tgcgaaagat agcagacgaa gaaggaattg agacacctca 

17641 acltatactt gctgglgcaa cattaggtac gattcaaaaa aacgtactaa tagagttaac taacaaatat 

17711 ggcattgagt Itaaltttga taaatataat tcattcatgt catttggcgt tcaagtggtt cagacaggtc 

17781 acagtalagt aagtggtata ggagctatac gtggtatgac atcgtttggt gcatatatca "gaagcgtc 

17851 g«a«gcat galgaggtgt Itgacgagat taagtcacgc tgtagtggaa ctggtgcaag aatattggta 

17921 gatalcaacc Itgaccltcc cgagcattgg ttgttgaaag attatattga aaatacagat cctaaagcag 

17991 gtatactgag tclccaattt aagctcgatg acaataactc tcttaatgat agatataaag ^gtctattaa 

18061 ggcttcalcl ccatcaggta tgttctatga acgtaatacc aacggtatgt gggtgtctgg tgacggtgta 

18131 gtatatgccg actttgattt gaatgagaat acgatcaaag cagatgaact ggacgacata "tatcaaag 

18201 aatactttgc tggtgtcgac tggggttacg agcactatgg atctattgtg ttaataggac gaggtataga 

18271 ^ggtaacttt ta!t?ta?tg agglgcacgc acaccaattt aagtttattg atgactgggt ggttattgca 

18341 aaagatattg taagtagata tggcaatatc aacttttact gcgatactgc acgacctgaa tacatcactg 

18411 aatltagaal acatagatta cgtgcaatta acgctgataa aagtaaaeta tcgggtgtgg aggaagttgc 

18481 eaagttgttc aaacaaaaca agttactcgt tctttatgat aatatggata ggtttaagca agaggtattt 

18551 aaatatottt ggcaccceac aaacggagag cccataaaag aatttgacga cgtgttggac tcgttaagat 

18621 atgccatata cacacatact aaacctgaac gactaaggag ggggaaatga cattgtataa gttaatagat 

18691 gatattgaag cacaaggaat ategcctaag catattgagg ctctaataga gtcacataaa gacgatagag 

18761 Lagaatgge taatctctat aatagataca agacacatat tgactatgta ccaatattca aacgtcgacc 

18831 altfgaagla aaagaagatt ttgaaactgg tggaaatgta aggcgattag acgtgtctgt taataacaaa 

18901 cttaacaact cttttgacag cgaaattgtt gatacacgtg ttggttattt acatggtgtt cctgttactt 

18971 atgatttaga tgaaalcgca gaaaaaaacg aaaagtcgaa aaagtttata accaactttg ccattagaaa 

19041 tagtgttgat gatgaggatt ccgaaatagg taaaatggca gcaatttgcg gatatggtgc taggttagca 

19111 tawftglta cgaatlgtga tattaggatt aagaatatag atccctataa tgttattttt gttggcgaca 

19181 atatttfaga acctalltlc tcattgcgct acttttatga aaaagatgat gataatggca ctgattatgt 

19251 gtacgcagag ttttacgata atgcttatta ttatgtattt cgaggagaag gtattgacgc tttgcaagaa 

19321 gttggacgat atgaacattt atctgattac aatccattgt ttggtgtacc taacaacaaa gagatgatag 

19391 gagl?gclga aaaggttatt cacttaattg acgcatatga tttaacaatg agcgatgcat caagtgagat 

19461 ta^tclgaca cgtllagcat accttgtgtt acgcggtatg ggtatgagtg aagaaatgat tcaagaaaca 

19531 caLagagtg glgcatttga gttgttcgac aaagatatgg acgttaaata cttaacaaaa gatgtaaatg 

19601 acacaatgat Igagaaccat ttagatcgaa tcgaaaagaa tatcatgcgt tttgcaaagt cagtaaactt 

19671 taattccgac gagtctaacg gaaatgtacc tatcattgga atgaaactta aacttatggc tttagagaac 

19741 aagtgcatga cgtttgagcg taagacgaca gctatgttga ggtatcaatt caaagttatt ttatctgcat 

19811 caaagcgtL agggtacaac ttggatgatg atagttattt aaacccgata tttaagttca ctcgtaacat 

19881 tccagttaat aagttagaag aatcacaagt gctaattaac ctgaagggac aagcttcaga acgaacaagg 

19951 ttaggacaat cacaactagt tgatgatgtt gattacgaat tagacgaaat ggaaaaagaa agtcttgaat 

20021 ttaatgacaa attacctgac atagatgaag gtgacgcaaa tgacaaatcc caaaataacc aatcagaatg 

20091 acatcgatga gtatacogag ggtttaatct ctaaagcaga aaaaccaata gaacaactat ttgctaatcg 

20161 acttaaagag ataaaacaaa ccaccgcaga tatgtttgag aaatatcaaa atgatgatgt gtatgttaca 

20231 tggactgaat tcaataaata caacaggctc aataaggagt taactcgtat aggtacaatg ttgacttatg 

20301 actacaggca agtagctaag atgatccaga agtcacaaga agatgcttat atagaaaaat tccttatgag 

20371 cctttat?ta tatgaaatgg cgagtcaaac atctatgcag tttgatgttc cgagtaaaga ggtaatcaaa 

20441 tcagctattg aacaacctat tgagttcatt cgtctaatgc caacactaca aaaacaccgc gatgaagtat 

20511 tgaaaaagat acgtatgcac attacacaag gtattatgag tggagagggt caccctaaga cagctaaagc 

20581 aatacgtgat gatgtcggca tgtctaaagc tcaatcateg cgtgcggctc gtacagaagc aggcagagca 

20651 atgtcacaag ctggacttga cagcgcaatg gtcgctaaag ataacggttt gaatatgaag aaacgttggc 
26041 gatttccaaa gcaacgtaag gaaagctcaa cgattagcaa agacgtctgt accaaacgaa attgaaacag 

26111 atgtaaaagc agatatttca agattccaaa gagctttaca acgcgctaaa tcaacggctc aacgatggcg 

26181 agagcattct gctaaattat tcatgaaaac agatgagtat aaagcgaatt tagaacgcgc taaagctcaa 

26251 gtagagcgat ttaaacaaca taaagtagat ttgaaactaa gcaacactga attaatggcc aaatataatg 

26321 caactaaagc tactgtcgaa gcttggagaa aacatgttgt taagttggat ttagacgcaa accccgctaa 

26391 aatggcggtt aaagggttta aagaagattt aatagatctt agcaggcata gttttgatat tgattccagc 

26461 agatggaaat taggaaataa attcacaaaa gaattcaatg aagtcgaagg agcagttaaa cgttctttcg 

26531 gaagaattgg tcagattatg agaaaagaag taaatggaac aagtgatatt tggggtaaac ttaacaactc 

26601 attgaaagat tacggcgaga aaatggacgc cttagctact aaaatccgaa ctttcggtac tatcttcgcg 

26671 caacaggtca aaggcttaat gattgctagt atacaagcat tgataccagt gattgccgga ttagtacctg 

26741 caataatggc agtacttaat gcggttggtg tattaggtgg tggcgtttta ggtttagttg gcgcattctc 

26811 tgtcgcaggt cttggagttg ttggctttgg cgcaatggct attagcgctc ttaaaatggt tgaagatgga 

26881 acattggcag taacaaaaga agttcaaaac tttagagatg cgagcgatca gttaaaaact acatggcgtg 

26951 atattgttaa agagaatcaa gcaagtatct ttaatgcgat gtcagcaggt atcagaggcg ttacaagtgc 

27021 gatgtctcaa ttaaaaccat tcttatccga agtatctatg ctagttgaag caaacgcacg cgagtttgag 

27091 aattgggtta aacattccga aacagctaag aaagcgtttg aagcattgaa tagcataggt ggcgcaatct 

27161 tcggagattt attgaacgct gcaggacgat ttggcgacgg attagttaac attttcactc aattaatgcc 

27231 gttgttcaaa tttgtgtctc aaggactaca gaacatgtct atagctttcc aaaattgggc taatagtgta 

27301 gctggtcaga atgctattaa agcgtttatt gactacacta ccactaactt acctaagatt ggtcagatat 

27371 ttggtaatgt gttcgctggt attggtaatt taatgattgc ttttgcacaa aacagttcca acatttttga 

27441 ttggttggtt aaattaactt ctcaatttag agcatggtca gaacaagtag gacaatcaca agggtttaaa 

27511 gactttatca gttatgttca agagaatggt cctactatta tgcagttaat cggtaatatc gtaaaagcat 

27581 tagttgcttt tggtactgca atggctccta tagctagtaa attgttagac tttatcacta atctagctgg 

27651 atttatcgct aaactattcg aaacacaccc agctatagca caagttgctg gcgttatggg tattttaggc 

27721 ggtgtatttt gggctttaat ggctccgatt gttgctataa gtagtgtact tacaaatgtg tttggtttga 

27791 gcttattcag cgtcactgaa aagattttag acttcgttag aacatcaagt ttagttactg gagctacgga 

27861 agcattaata ggtgcattcg gttcgatttc agcacctatt ttagcagttg ttgcagtaat tggtgcattc 

27931 attggtgtcc tcgtttattt atggaaaaca aacgagaact ttagaaatac tattactgaa gcgtggaacg 

28001 gtgttaaaac ggcagtttct ggtgcgattc aaggtgtagt cggctggtta actgaattgt ggggcaaaat 

28071 ccaatctacc ttacaaccga taatgcctat attgcaagta ttaggacaaa tattcatgca agttttaggt 

28141 gttttggtaa taggcatcat tacaaacgtt atgaatatca tacaaggttt gtggacttta attacaattg 

28211 cgttccaagc cataggaaca gtgatatccg tagcagtcca aatcatagta ggtttgttca ctgctttaat 

28281 tcagttgctt actggcgact tctcaggtgc ttgggagact attaaaacta cggttaccaa tgtgcttgat 

28351 acgatttggc aatacatgca atcagtttgg gagtcaatta tcggcttttt aactggcgta atgaatcgaa 

28421 cactttctat gtttggtaca agttggtcac agatatggag tacaatcact aattttgtta gcagtatttg 

28491 gaacactgtt acaagttggt tcagtcgagt ggcttcgagt gtagctgaaa aaatggggca agcactaaac 

28561 tttattatca caaaaggttc tgaatgggtt tctaacattt ggaatacagt tacaagtttc gcgagtaaag 

28631 tagctgatgg gtttaaaaga gttgtctcaa atgtaggtga cggtatgagt gatgcacttg gtaagattaa 

28701 aagtttcttc agtgatttct taaatgccgg agcggaatta atcggcaaag tagctgaggg tgtagccaaa 

28771 tctgcgcaca aagtagtcag cgcggtaggc gatgcgattt catcagcttg ggactctgta acttcattcg 

28841 taagtggaca cggtggaggt agtagcttag gtaaaggttt agcggtatca caagcaaaag taattgctac 

28911 agactttggc agtgccttta ataaagagct atcctctact ttgacagata gtatagtaaa tcctgtaagt 

28981 acttctatag acagacacat gactagcgat gttcaacata gcttaaaaga aaataataga cctattgtga 

29051 atgtaacgat tagaaatgag ggcgaccttg atttaattaa atcacgcatt gatgacatga acgctataga 

29121 cggaagtttc aacttattat aagggaggtt tgttagttga tagcgcacga tatagaagta ataaggaatg 

29191 gttcacagea tcgcgtcagt gacaatcctt tcacttataa tcacttggaa gtagttgaat ataacgttac 

29261 aggcgcagga tatcatcgta actattctga tatagagggt attgatggta gatttcataa ttacgctaaa 

29331 gaagaactta aaaaagtaga gcttaagata aggtataaag tacctaaaat tgcttatgct tcacatttaa 

29401 agtcagacgt ccaagcacta tttgctggac gtttttattt aagggaatta gctacaccag acaattcaat 

29471 taagtatgag catatattag atataccaaa agacaaacaa gcatttgagc ttgatcatgt tgatggacga 

29541 caactttttg taggactagt aagtgaagtt tcttttgaca caacacaaac atcaggggaa ttttctttgt 

29611 cgtttgaaac aaccgaacta ccatactttg aaagtgtcgg ttatagtact gatcttgaaa gtaataacga 

29681 ccctgaaaaa tggtcggtac ctgatagatt gcctacaaac gaaggtgata agaggcgtca aatgacattt 

29751 tacaacacta actcaggaga agtttattat aacggtgatg ttcctttaac acagtttaat cagtttaatg 

29821 ttgttgaaat agagttagct gaagatgtta aagctaatga taaggatgga ttcactttct atacagataa 

29891 aggaaatatc tcagttatta aggaagttga tttaaaagcc ggagataaaa taatcttcga cggtaaacat 

29961 acctatagag gttatttaaa tatagattct tttaataaaa ctttagaaca accggtttta tatccaggct 

30031 ggaatcgatt caagtctaat aaagtaatga aacaaattac atttagacac aaattatatt ttagataagg 

30101 agtagcctat gccaatttta ttaaaaagtc tacagggtgt agggcacgct attaatgtta grtacaaaggt 

30171 aagtaaaaag ctaaatgaag atagttcttt ggatctaact attatcgaga acgcgagtac gtttgacgca 

30241 ataggtgcta caactaaaat gtggacgatc actcatgttg aaggtgaaga tgatttcaac gaatatgtaa 

30311 ttgtcatact tgataagtct actattggcg aaaaaataag gcttgatatc aaagctaggc aaaaagaact 

30381 tgatgacctt aacaattcta ggatttacca agagtataac gaaagtttta caggcgttga gttcttcaat 

30451 actgtcttta aaggaacggg ttataagtat gtattacatc caaaagtaga tgcatctaaa ttcgagggat 

30521 taggcaaagg agatacacga ttagaaatct ttaaaaaagg acttgagcgt tatcatctcg aatatgaata 

30591 cgatgcaaag actaaaacgt ttcatttgta tgatgaatta tctaagtttg ccaattatta cattaaagct 

30661 ggtgtgaatg ccgataacgc caaaatacaa gaagatgcat ctaaatgtta tacctttatt aaaggttatg 

30731 gtgattttga tggacaacag acttttgcag aagcgggact acaaattgaa ttcactcatc cattagcaca 

30801 attgataggt aaaagagaag cgccaccgct tgttgatgga cgtattaaaa aagaagatag tttaaaaaaa 

30871 gcaatggagt tattgataaa gaaaagtgtc actgcttcta tttccttaga cttcgtagcg ttacgtgaac 

30941 atttcccaga agctaaccct aaaataggtg atgttgttag agtggtggat tctgccatag gatataacga 

31011 cttagtgaga atagtcgaaa tcactacaca tagagatgcg tacaataata tcactaagca agatgcagta 

31081 ttaggagact ttacaaggcg taatcgttat aacaaagcag ttcatgatgc tgcaaattat gttaaaagcg 

31151 taaaatctac aaaatccgac ccatctaaag aactaaaagc attaaacgca aaagttaacg caagtttatc 

31221 tataaataat gaattggtta agcagaatga aaaaataaac gctaaagtcg ataagatgaa tactaaaaca 

31291 gttacaactg ccaatggtac gatcatgtac gactttacta gtcaatcaag tataagaaac atcaaatcaa 
36681 ggtttggtaa atgtaaataa cagttaagag tcagtgcttc ggcactggct ttttattttg attgaaatga 

36751 ggtgcataca tgggattacc taacccaaag actagaaagc ctacagctag tgaagtggtg gagtgggcaa 

36821 agtcgaatat tggtaagagg attaatatag ataattatcg gggcagtcaa tgttgggata cacctaactt 

36891 tatttttaaa agatattggg gttttgtaac atggggcaat gctaaggata tggctaatta cagatatcct 

36961 aagggtttcc gattctatcg ttattcatct ggatttgtac cggaacctgg agacatcgca gtttggcacc 

37031 ctggcaacgg aataggttcg gacggacaca ccgcaatagt agtaggacca tctaataaaa gttattttta 

37101 tagcgttgac caaaactggg ttaattctaa tagttggaca ggttctccag gaagattagt aagacaccct 

37171 tatgtaagtg ttacaggctt tgttaggcct ccatactcaa aagatactag caaacctagt agtactgata 

37241 caagttcagc atcaaaagcc aatgactcaa caattactgg cgaagcgaag aaaccgcaat ttaaagaagt 

37311 taaaacagta aaatacactg cttacagcaa tgttttagat aaagaagagc acttcattga tcatatagtt 

37381 gtaatgggtg atgaacgctc agatattcaa ggattatata taaaagaatc aatgcatatg cgttctgtag 

37451 acgaactgta tacgcaaaga aataagttta taagcgatta tgaaataccg catttatatg tcgatagaga 

37521 ggctacatgg cttgctagac caaccaattt tgatgacccg cgtcacccta attggctagt tattgaagta 

37591 tgtggtggtc aaacagatag caaacgacaa ttcttattga atcaaataca agcgttaata cgtggtgttt 

37661 ggttattgtc agggattgat aaaaacttat ctgaaacgac gttaaaggta gaccctaata tttggcgtag 

37731 tatgaaagat ttaattaatt acgacttgat taagcaaggt ataccggata acgcaaagta tgagcaagtt 

37801 aaaaagaaaa tgcttgagac atacattaaa cgagatatat tgacacgaga aaatataaaa gaagtaacga 

37871 caaaaacaac aataagaatt agtgataaaa catcagttga cagtgcgtcc acacgaggcc ctactccatc 

37941 agacgaaaaa ccaagcatcg ttactgaaac aagtccattc acattccagc aagcactgga tagacaaatg 

38011 tctaggggta acccgaaaaa atctcataca tggggctggg ctaatgcaac acgagcacaa acgagctcgg 

38081 caatgaatgt taagcgaata tgggaaagta acacgcaatg ctatcaaatg cttaatttag gcaagtatca 

38151 aggcatttca gttagtgcgc ttaacaaaat acttaaagga aaaggaacgc tcgacggaca aggcaaagca 

38221 ttcgcggaag cttgtaagaa aaacaacatt aacgaaattt atttgatcgc gcacgctttc ttagaaagtg 

38291 gatacggaac aagtaacttc gctagtggta gatacggtgc atataattac ttcggtattg gtgcattcga 

38361 caacgaccct gattatgcaa tgacgtttgc taaaaataaa ggttggacat ctccagcaaa agcaatcatg 

38431 ggcggtgcta gcttcgtaag aaaggattac atcaataaag gtcaaaacac attgtaccga attagatgga 

38501 atcctaagaa tccagctacc caccaatacg ctactgctat agagtggtgc caacatcaag caagtacaat 

38571 cgctaagtta tataaacaaa tcggcttaaa aggtatctac ttcacaaggg ataaatataa ataaagaggt 

38641 gtgtaaatgt acaaaataaa agatgttgaa acgagaataa aaaatgatgg tgttgactta ggtgacattg 

38711 gctgtcgatt ttacactgaa gatgaaaata cagcatctat aagaataggt atcaatgaca aacaaggtcg 

38781 tatcgatcta aaagcacatg gcttaacacc tagattacat ttgtttatgg aagatggctc tatattcaaa 

38851 aatgagcccc ttattatcga cgatgttgta aaagggttcc ttacctacaa aatacctaaa aaggttatca 

38921 aacacgctgg ttatgttcgc tgtaagctgt ttttagagaa agaagaagaa aaaatacatg tcgcaaactt 

38991 ttctttcaat atcgttgata gtggtattga atctgctgta gcaaaagaaa tcgatgttaa attggtagat 

39061 gatgctatta cgagaatttt aaaagataac gcgacagatt tattgagcaa agactttaaa gagaaaatag 

39131 ataaagatgt catttcttac atcgaaaaga atgaaagtag atttaaaggt gcgaaaggtg ataaaggcga 

39201 accgggacaa cctggtgcga aaggtgatac aggtaaaaaa ggagaacaag gcgcacccgg taaaaacggt 

39271 actgtagtat caatcaatcc tgacactaaa atgtggcaaa ttgatggtaa agatacagat atcaaagcag 

39341 aacctgagtt attggacaaa atcaatatcg caaatgttga agggttagaa gataaattgc aagaagttaa 

39411 aaaaatcaaa gatacaactc tcaacgactc taaaacgtat acggattcaa aaattgctga actagttgat 

39481 agcgcgcctg aatctatgaa tacattaaga gaattagcag aagcaataca aaacaactct atttcagaaa 

39551 gtgtattgca acagattggc tcaaaagtta gtacagaaga ttttgaggaa ttcaaacaaa cactaaacga 

39621 tttatatgct ccaaaaaatc ataatcatga tgagcggtat gttttgtcat ctcaagcttt tactaaacaa 

39691 caagcggata atttatatca actaaaaagc gcatctcaac cgacggttaa aatttggaca ggaacagaaa 

39761 atgaatataa ctatatatat caaaaagacc ctaatacact ttacttaatt aaggggtgat ttttatggaa 

39831 ggtaatttta aaaatgtaaa gaagtttatt tacgaaggtg aagaatatac aaaagtatat gctggaaata 

39901 tccaagtatg gaaaaagcct tcatcttttg taataaaacc cttacctaaa aataaatatc cggatagcat 

39971 agaagaatca acagcaaaat ggacaataaa tggagttgaa cctaataaaa gttatcaggt gacaatagaa 

40041 aatgtacgta gcggtataat gagggtttcg caaactaatt taggttcaag tgatttagga atatcaggag 

40111 tcaatagcgg agttgcaagt aaaaatatca actttagtaa tccttcaggg atgttgtatg tcactataag 

40181 tgatgtttat tcaggatctc caacattgac cattgaataa ttttaaacga ctaatttttt agtcgttttt 

40251 tattttggat aaaaggagca aacaaatgga egcaaaagta ataacaagat acatcgtatt gatcttagca 

40321 ttagtaaatc aattcttagc gaacaaaggt attagcccga ttccagtaga cgatgagact atatcatcaa 

40391 taatacttac tgttgttgct ttatatacta cgtataaaga caatccaaca tctcaagaag gtaaatgggc 

40461 aaatcaaaag ctaaagaaat ataaagctga aaacaagtat agaaaagcaa cagggcaagc gccaattaaa 

40531 gaagtaatga cacctacgaa tatgaacgac acaaatgatt tagggtaggt gttgaccaat gttgataaca 

40601 aaaaaccaag cagaaaaaeg gtttgataat tcattaggga agcagttcaa tcctgatttg ttttatggat 

40671 ttcagtgtta cgattacgca aatatgtttt ttatgaeagc aacaggcgaa aggttacaag gtttatacgc 

40741 ttataatatt ccatttgata ataaagcaag gattgaaaaa tacgggcaaa taattaaaaa ctatgatagc 

40811 tttttaccgc aaaagttgga tattgtcgtt ttcccgtcaa agtatggtgg cggagctgga catgttgaaa 

40881 ttgttgagag cgcaaattta aacactttca catcatatgg gcaaaattgg aatggtaaag gttggacaaa 

40951 tggcgttgcg caacctggtt ggggtcctga aactgttaca agacatgttc attattacga tgacccaatg 

41021 tattttatta gattaaattt cccagataaa gtaagtgttg gagataaagc taaaagcgtt attaagcaag 

41091 caactgccaa aaagcaagca gtaattaaac ctaaaaaaat tatgcttgta gccggtcatg gttataacga 

41161 tcctggagca gtaggaaacg gaacaaacga acgcgatttt atccgtaaat atataacgcc aaatatcgct 

41231 aagtatttaa gacatgcagg tcatgaagtt gcattatatg gtggctcaag tcaatcacaa gacatgtatc 

41301 aagatactgc atacggtgtt aatgtaggaa ataataaaga ttatggatta tattgggtta aaccacaggg 

41371 gtatgacatt gttctagaga ttcatttaga cgcagcagga gaaaatgcaa gtggtgggca tgttattatc 

41441 tcaagtcaat tcaatgcgga tactattgat aaaagtatac aagatgttat taaaaataac ttaggacaaa 

41511 taagaggtgt aacacctcgt aatgatttac cgaacgttaa tgtatcagca gaaataaata tcaactatcg 

41581 tttatctgaa ttaggtttta ttactaataa aaaagatatg gattggatta agaagaatta tgacttgtat 

41651 tctaaattaa tagctggtgc gattcatggt aagcctatag gtggtttggt agctggtaat gttaaaacat 

41721 cagctaaaaa ccaaaaaaat ccaccagtgc cagcaggtta tacacttgat aagaataatg tgccttataa 

41791 aaaagagact ggtaattaca cagttgccaa tgttaaaggt aataacgtaa gggacggcta ttcaactaat 

41861 tcaagaatta caggtgtatt acctaataac gcaacaatca aatatgacgg cgcatattgc atcaatgggt 

41931 atagatggat tacttatatt gctaatagtg gacaacgtcg ctatattgcg acaggagagg tagataaagc 
42001 aggtaatagg ataagtagtt ttggtaagtt tagcacgatt tagtatttac ttagaataaa aatttcgcta 

42071 cattaatcat agggaatctt acagttatta aataactatt tggatggatg tcaatattcc tatacacttt 

42141 ttaacattac tctcaagatt taaatgtaga taacaggcag gtactacggt acttgcctat ttttttgtta 

42211 taatgtaatt acattaccag taaccaatct ggcttaaaac cacatttccg gtagccaatc cggctatgca 

42281 gaggacttac ttgcgtaaag tagtaagaag ctgactgcat atctaaacca cccatactag ttgctgggtg 

42351 gttgtttttt atgttatatt ataaatgatc aaaccacacc acctattaat ttaggagtgt ggttattttt 

42421 tatgcaaaaa aaacgaaaaa aagttcataa aaagtattgc atatcacgtt taaccgtgtt ataataaggt 

42491 ataccagttg agaggaggat aaaaagtgtt agaaaatttt aaaactatag cagaaatcgc ctttcataca 

42561 atgtcagcaa ttgccatagc gaaaacattg aaaaaagacg ataagtaagt agacaagccc gaaagggctg 

42631 tctatatata aattctaaca ctaaaatact atgaaaacaa tttacattat tttaatcatt cttatttgga 

42701 taaacgtgtt tttaggcaac gatataagta aaagtgttgt tgcactgctt actactttac tgcttatcaa 

42771 tttatggaag agggataaaa atgacagcaa taaaagaaat aattgaatca atagaaaagt tattcgaaaa 

42841 agaaacggga tataaaattg ctaaaaattc cggattacca tatcaaactg tgcaagattt aagaaatgga 

42911 aaaacatctt tatcagatgc cagatttaga acgataataa agttatacga gtatcaaaga tcgcttgaaa 

42981 acgaagaaga taaataaaag gagccaaaaa tatgtttgtt acaaaagaag aatttaaaac tttgaatgta 

43051 aaagaagtat ttgaatcagg taaaaacttt ataaaaatta cagatggaag acatgcaata tattgggtaa 

43121 atgatagata cgtagtactt gaccataaaa aaggcgattt gtacccgcaa aaagcatacc caaaatatat 

43191 caaaagaaaa ttagtaagtt aaataattag aaaaccacgt cttaattgac gtggttattt tttaggtttg 

43261 cgcgtgtcaa atacgtgtca atttagttct atttctttag ttttctttct aaacttaatt gcttgtaaac 

43331 cgcatagtta taggcttttc agctatatac caagataaga tttatcccgc cgtctccata aaaatatgct 

43401 tggaaacctt gatttaatgg ggttttaatc tagcaagtgt caaatatgtg tcaagaaaat aattttctga 

43471 cacgttgacc ttgctctttt ttatgttcat caagtaagtg agagtaggtg tctaaagtta tagatatatt 

43541 ataatggcct aatcttttgc taatatattc aatagg 
Table 10 

Bacteriophage 96 ORFs list 

SID     LAN       FRA  POS           a.a.  RBS sequence               STA  STO
100733  96ORF001    1  25999..29142  1047  ccttgaatcgaaaggaggttagcct  ttg  taa
100734  96ORF002    1  32008..33906   632  tttttacgactaaaggaggcaacca  atg  taa
100735  96ORF003    1  30109..31995   628  ttatattttagataaggagtagcct  atg  taa
100736  96ORF004    1  36760..38634   624  attttgattgaaatgaggtgcatac  atg  taa
100737  96ORF005    3  33903..35729   608  gtttattcgaaggaaaggtggttga  ata  taa
100738  96ORF006    2  40589..42043   484  aatgatttagggtaggtgttgacca  atg  tag
100739  96ORF007    1  18652..20091   479  tatacacacatactaaacctgaacg  att  tga
100740  96ORF008    2  8960..10201    413  tggcagaatttgggggcgataacga  atg  tga
100741  96ORF009    2  17447..18670   407  gacgcaataacggaagtgatcgtca  atg  tga
100742  96ORF010    1  38647..39819   390  taaatataaataaagaggtgtgtaa  atg  tga
100743  96ORF011   -1  119..1195      358  gtagctcgcctacccttattatttt  ttg  tga
100744  96ORF012    2  20045..21013   322  tttaatgacaaattacctgacatag  atg  tga
100745  96ORF013    3  29157..30098   313  acttattataagggaggtttgttag  ttg  taa
100746  96ORF014    1  21925..22839   304  agaaaataaagtgaggtaataaaac  atg  tag
100747  96ORF015    1  5812..6591     259  atacacggtaaaggtgggagaatag  atg  taa
100748  96ORF016    1  7852..8607     251  aataaaatgttgaaaggagagaaaa  atg  taa
100749  96ORF017    3  3444..4190     248  aaatttaacattaatatcactttaa  gtg  taa
100750  96ORF018   -3  28281..29000   239  taagctatgttgaacatcgctagtc  atg  tga
100751  96ORF019    3  7188..7859     223  tttaccgttctaggacgtggtttaa  atg  taa
100752  96ORF020    3  21324..21908   194  gaagggcaaaaaggagttttgatat  atg  taa
100753  96ORF021    3  6612..7175     187  attaaaaattaattaaaaggacggt  ata  tag
100754  96ORF022    2  24536..25093   185  aaagaaaaacgaaggagtgtattaa  atg  taa
100755  96ORF023    1  5275..5811     178  catgaaatggtaggaggtatgaaaa  gtg  tag
100756  96ORF024    3  14481..15014   177  taaaacgataggagataacgaataa  atg  taa
100757  96ORF025    2  25157..25666   169  ataaaaaaattgaaaagaggtatat  att  taa
100758  96ORF026   -3  15084..15590   168  tcattcttaacatagcccttaattc  atg  tga
100759  96ORF027   -1  1229..1732     167  aatagcaaataaaggagtgtaaaac  atg  taa
100760  96ORF028    1  16960..17454   164  aaggcgtgtgatacagtgaaaacaa  ttg  taa
100761  96ORF029   -1  1736..2227     163  tatgagaaaaggagtcatataaaag  atg  taa
100762  96ORF030    1  25531..25995   154  ttttcaagagggagagtcgctcgta  ctg  tag
100763  96ORF031    2  23633..24097   154  tttagtattgaaggtgattctgtag  atc  tag
100764  96ORF032   -2  2248..2706     152  ataagacaccaaaggggtttggcgc  atg  tga
100765  96ORF033   -3  39147..39605   152  agcatataaatcgtttagtgtttgt  ttg  taa
100766  96ORF034    2  13181..13615   144  tagaagtcgaaaaagtggaggcaat  ata  taa
100767  96ORF035    2  10628..11053   141  gagctaggattgcaagcaacgatat  ttg  tga
100768  96ORF036    2  24110..24535   141  gtatttttcatagaggtggttaaat  atg  taa
100769  96ORF037    1  12583..12996   137  atgaggaacagaagcaaccaacttt  att  tga
100770  96ORF038    1  15628..16032   134  atgttaagaatgatgcctagtttaa  ttg  taa
100771  96ORF039    3  39816..40220   134  ctaatacactttacttaattaaggg  gtg  taa
100772  96ORF040   -3  27528..27932   134  tttccataaataaacgaggacacca  atg  tga
100773  96ORF041    3  16206..16607   133  gatgagggcggaggtgtcagagtag  atg  tga
100774  96ORF042    2  35720..36106   128  aagttactataactaaaattatggg  gtg  taa
100775  96ORF043  [?]  [?]            122  ttaaacgtccccctcagtatttgtt  ttg  taa
100776  96ORF044   -2  9460..9828     122  agtatccatcagttgaagataatct  ata  taa
100777  96ORF045   -3  5139..5504     121  ttctttttgtattctgtaatattca  att  tga
100778  96ORF046    2  11513..11872   119  aagtaaatgtatagaggtggaataa  atg  taa
100779  96ORF047    2  22991..23350   119  gtcgtactacgtctgataagagcga  gtg  tag
100780  96ORF048    3  8607..8963     118  tggaaaaagaattgagtgatgacta  atg  tga
100781  96ORF049    1  23353..23697   114  atccgtttaaaccaataaggtagag  gtg  taa
100782  96ORF050   -2  2728..3072     114  tggtaaattagtattacattaagta  ata  taa
100783  96ORF051    3  4692..5021     109  tcaaaatatacggaggtagtcaact  atg  tga
100784  96ORF052   -1  20882..21211   109  gtagcaaagagacaactaaaaaagt  gtg  taa
100785  96ORF053    1  40252..40578   108  acgactaattttttagtcgtttttt  att  tag
100786  96ORF054    1  4942..5262     106  aatataaaactaaaaaacaaaattt  atg  ta?
100787  96ORF055   -2  4840..5151     103  ccgtcgcaatatatagttcgcttaa  atc  taa
100788  96ORF056    3  36324..36623    99  aatttaacacaaagtaggtggcgta  atg  taa
100789  96ORF057    2  1394..1690      98  cttcagtggctcttttagcatttaa  ata  taa
100790  96ORF058   -3  26247..26537    96  tacttcttttctcataatctgacca  att  tga
100791  96ORF059   -1  21485..21772    95  agactcaacgcctttttgaacatac  ttg  tga
100792  96ORF060   -3  22647..22931    94  cctctttgtaaccgacaagactgta  ata  taa
100793  96ORF061    1  14023..14304    93  ttatctaattaagggggacgagtga  gtg  taa
100794  96ORF062   -2  38281..38559    92  tatataacttagcgattgtacttgc  ttg  taa
100795  96ORF063   -3  30786..31064    92  gtctcctaatactacatcttgctta  gtg  tga
100796  96ORF064   -2  30205..30480    91  atgcatctacttttggatgtaatac  ata  tag
100797  96ORF065    1  2617..2886      89  aaggtctaataaaaatttctccttc  ttg  taa
100798  96ORF066    3  28056..28325    89  aaggtgtagtcggctggttaactga  att  taa
100799  96ORF067   -3  17142..17411    89  ttccgtcattgcgtcgtgaagttgt  ttg  tga
100800  96ORF068    2  12326..12589    87  aatgcatgtcgtttggtctgcctaa  ttg  tag
100801  96ORF069    2  42734..42997    87  tttttaggcaacgatataagtaaaa  gtg  taa
100802  96ORF070    1  11869..12129    86  aaatgttcaagaaatggagtgaagc  ata  taa
100803  96ORF071    3  15396..15656    86  aacaagctatacaaattatcgataa  att  taa
100804  96ORF072   -3  37749..38009    86  agattttttcgggttacccctagac  att  taa
100805  96ORF073    3  11244..11501    85  acatgcatatatagaggtggaataa  atg  tag
100806  96ORF074   -3  42936..43193    85  aattatttaacttactaattttctt  ttg  taa
100807  96ORF075   -3  26610..26867    85  tactgccaatgttccatcttcaacc  att  taa
100808  96ORF076   -1  11126..11380    84  tttatctaatacatttaagttaacc  atc  taa
100809  96ORF077   -2  16537..16791    84  tacccaccatataggcaggtagtag  gtg  tag
100810  96ORF078   -3  19521..19775    84  aataactttgaatcgatacctcaac  ata  tga
100811  96ORF079    3  13608..13859    83  ttagggcaaatggaggcagacacaa  atg  tag
100812  96ORF080   -3  28029..28280    83  tgagaagtcgccagtaagcaactga  att  tga
100813  96ORF081    3  20973..21221    82  aatgaagctatcccattcatgactt  atc  tag
100814  96ORF082   -1  8729..8974      81  cgattactgtgctttcaatttcaaa  ttg  tga
100815  96ORF083   -3  3147..3392      81  tttagcctttatataatcaacttct  gtg  tga
100816  96ORF084    3  1611..1853      80  tgctttatctttagtttctttcttt  ttg  tga
100817  96ORF085   -2  29470..29709    79  ctcttatcaccttcgtttgtaggca  atc  taa
100818  96ORF086    1  35188..35424    78  gcgcaaggcgatttgggatattcaa  ctg  tag
100819  96ORF087   -2  13039..13275    78  ttttgattgagctctaaagtgtctt  att  tag
100820  96ORF088    3  24930..25163    77  gaactatcattaaaagttaaatgga  ata  tga
100821  96ORF089   -3  22329..22562    77  tccagtataagatagtggtaatccc  ata  taa
100822  96ORF090   -3  16803..17036    77  acctttagtcgaataccctgcgtca  ata  tag
100823  96ORF091   -1  22559..22789    76  aacgcttctggtttaacgttcatgt  ata  taa
100824  96ORF092    3  18360..18587    75  attgcaaaagatattgtaagtagat  atg  taa
100825  96ORF093   -2  25384..25608    74  catgatttccttgtaattctctttc  atc  taa
100826  96ORF094    1  10417..10638    73  aacacacattaaggagtgttaaaaa  ata  tag
100827  96ORF095    3  12963..13184    73  tactaaacgaagataaaactatgac  att  taa
100828  96ORF096    1  42994..43212    72  gatcgcttgaaaacgaagaagataa  ata  taa
100829  96ORF097   -1  36047..36265    72  tcaagcattacacctgtgacttttc  atc  taa
100830  96ORF098   -2  36766..36984    72  caggttccggtacaaatccagatga  ata  taa
100831  96ORF099   -2  34765..34983    72  tcattctttttataaaacgggtacc  atg  tag
100832  96ORF100    1  10198..10413    71  acaagaagactcagaggtttttcac  atg  taa
100833  96ORF101    1  15208..15423    71  gagaaacaagttaagataaggagag  atg  tga
100834  96ORF102    3  4209..4424      71  attttaaaacgaaatataggagagg  ctg  tag
100835  96ORF103    3  11673..11888    71  catgcaccttatggtatgcgcttag  ctg  taa
100836  96ORF104    3  12117..12332    71  tttacgtccaaagagcttttgactt  gtg  taa
100837  96ORF105    3  23892..24107    71  gatggtgggttatccagtgttataa  gtg  taa
100838  96ORF106   -3  34428..34643    71  tagacctttgccaatttgttgttga  att  taa
100839  96ORF107   -3  24495..24710    71  ggcacattaccaattgttaatttaa  atg  taa
100840  96ORF108   -1  23876..24088    70  acatatttaaccacctctatgaaaa  ata  taa
100841  96ORF109   -2  17317..17529    70  acctgtacgctttgctccgtgatta  att  taa
100842  96ORF110   -3  38931..39143    70  actttcattcttttcgatgtaagaa  atg  taa
100843  96ORF111   -3  21855..22067    70  agtaaattttttcttttgtgctgtc  att  tga
100844  96ORF112    1  3217..3426      69  aaatgtcaacgggaggtgatacgaa  atg  taa
100845  96ORF113   -1  25469..25678    69  tcagggatatatcctaaatatctag  ctg  taa
100846  96ORF114   -2  9838..10047     69  ataataatcatcacggtaaagtagc  atc  tga
100847  96ORF115    1  13819..14022    67  gcagtaggggttatggcaggtcaag  ttg  tga
100848  96ORF116   -1  41033..41236    67  caacttcatgacctgcatgtcttaa  ata  taa
100849  96ORF117   -3  24711..24914    67  tctgctgtattccatttaactttta  atg  taa
100850  96ORF118   -1  12374..12574    66  tccatctcctctaaaataaagttgg  ttg  taa
100851  96ORF119   -1  3980..4180      66  ctcctatatttcgttttaaaatttc  att  tga
100852  96ORF120   -3  6033..6233      66  ttgtaatttagaaatataacgataa  ata  taa
100853  96ORF121   -2  37939..38136    65  ctgaaatgccttgatacttgcctaa  att  tga
100854  96ORF122    2  37892..38086    64  acgacaaaaacaacaataagaatta  gtg  tga
100855  96ORF123   -3  29193..29387    64  ggacgcctgactttaaatgtgaagc  ata  tga
100856  96ORF124    1  4408..4599      63  tttatcggtaccaatttaatgatta  atg  taa
100857  96ORF125   -1  7787..7978      63  ttaaaaatccaagttttgccatcgt  att  tga
100858  96ORF126   -3  27027..27218    63  aaatttgaacaacggcattaattga  gtg  tga
100859  96ORF127    3  15051..15239    62  atcgagtcaaggaggttttggggaa  gtg  tga
100860  96ORF128   -1  6914..7102      62  agcgaatgggtttgattgttgactc  ata  [?]
100861  96ORF129   -3  31332..31520    62  tcttatttgctctgcttgtctataa  atg  tga
100862  96ORF130   -3  30084..30272    62  gaaatcatcttcaccttcaacatga  gtg  taa
100863  96ORF131    3  11058..11243    61  agaaaaagagaaatgaagtgatcta  atg  taa
100864  96ORF132   -1  36434..36619    61  taagcatggtaatcacctcctttaa  ata  tga
100865  96ORF133   -1  35591..35776    61  ctaaactattgcgtaaaccgccagt  att  taa
100866  96ORF134   -2  9250..9435      61  acccatgagcttataacccgtctta  att  tga
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100867 


960RF135 


1 


29563. .29745 


60 


cgacaaccttttgtaggactagtaa 


gtg 


tga 


100668 


960RF136 


-3 


12486. .12668 


60 


cactttactttcaacttgttcagga 


ttg 


taa 


100869 


960RF137 


-1 


14501. .14680 


59 


caaactgaaagctaagtaatcagca 


ate 


tga 


100870 


960RF138 


-2 


23326. .23505 


59 


cttgtgacatttgatgaaattttag 


fct 9 


tga 


100871 


960RF139 


-3 


42672. .42851 


59 


aatccggaatttttagcaattttat 


ate 


taa 


100872 


96ORF140 


-3 


31137. .31316 


59 


act tgac tgactagtaaagt eg t ac 


atg 


taa 


100873 


960RF141 


-3 


18969. .19148 


59 


aacaaaaataacactatagggatct 


ata 


taa 


100874 


960RF142 


-3 


4740. .4919 


59 


cataaattttgttttttagttttat 


att 


tga 


100875 


960RF143 


2 


36107. .36263 


58 


aacaaat actgagggggacgt t c aa 


atg 


taa 


100876 


960RF144 


3 


16029. .16205 


58 


tatacgaagtaaagaaggtagataa 


ata 


tag 


100877 


960RF145 


-3 


29013 . .29189 


S8 


tgtcactgacgcgatactgtgaacc 


att 


tga 


100878 


960RF146 


-3 


14883. .15059 


58 


aatctttgaacgttgtgactaagca 


ttg 


taa 


100879 


960RF147 


-1 


18251. .18424 


57 


tatcagcgttaattgcacgtaatct 


atg 


taa 


100880 


960RF148 


-1 


13583. .13756 


57 


aataccttctttaactgaatgttga 


ata 


taa 


100881 


960RF149 


-2 


10756. .10929 


57 


taaattcacatctctatactgatat 


ctg 


tag 


100882 


96ORF150 


2 


14171. .14341 


56 


atttttaatgaagaagtgttattaa 


ctg 


tag 


100883  96ORF151  2  19217..19387  56  cctacatactcattgcgctactttt  atg  tga
100884  96ORF152  -1  12614..12784  56  atttctacagtaaaaatatctttat  ctg  taa
100885  96ORF153  -2  11836..12006  56  ttgcattacctattgcgaatgctag  ttg  taa
100886  96ORF154  -2  4165..4335  56  atataacgcttttgtcctcgaccaa  atc  tga
100887  96ORF155  -3  40464..40634  56  aaatcaggattgaactgcttcccta  atg  tga
100888  96ORF156  3  423..590  55  tggtaattttgataatttagcttta  ata  taa
100889  96ORF157  -1  41879..42046  55  gtagcaaaatttttattctaagtaa  ata  taa
100890  96ORF158  -2  36166..36333  55  catccatgttcgtgccgtttggtaa  atc  tag
100891  96ORF159  -2  16228..16395  55  tttaacatctgagcataccttttat  ttg  taa
100892  96ORF160  3  1038..1202  54  atctctaagcagttgttgagcagcg  ttg  taa
100893  96ORF161  -1  19193..19357  54  tctttgttgttaggtacaccaaaca  atg  tag
100894  96ORF162  -1  18074..18238  54  ctcgtcctattaacacaatagatcc  ata  tga
100895  96ORF163  -1  15386..15550  54  agccatcataggactgtaaaattca  ctg  taa
100896  96ORF164  -1  10049..10213  54  tacatcgatttcaataagcttttga  att  tag
100897  96ORF165  -2  18514..18678  54  gtgcttcaatatcatctattaactt  ata  taa
100898  96ORF166  -2  11104..11268  54  ctagccatgattacccttaaattag  ttg  tag
100899  96ORF167  -3  13764..13928  54  agacagtttataatgtgtatctcta  ata  tga
100900  96ORF168  1  14305..14466  53  ttttgaatttttggaggacgagtaa  atg  tag
100901  96ORF169  -1  17885..18046  53  gtgttgaagccttaatagactcttt  ata  tga
100902  96ORF170  -1  10790..10951  53  taggcgctttacatatccacgttaa  att  taa
100903  96ORF171  -3  12765..12926  53  atcttcgtttagtatataaaacgct  ctg  taa
100904  96ORF172  3  22836..22994  52  cgttcgcaacgcttaaaccaactga  ata  tga
100905  96ORF173  -1  15956..16114  52  ctctacatcatcattagccgtcgtc  ata  taa
100906  96ORF174  -1  10571..10729  52  tagtgccattcatattactttctaa  ata  taa
100907  96ORF175  -1  3440..3598  52  cagcctatcttcactatcaacatga  ttg  taa
100908  96ORF176  -3  37170..37328  52  tttatctaaaacattgctgtaagca  gtg  taa
100909  96ORF177  -3  6693..6851  52  ttcctaatctactaagtaactcgat  ata  taa
100910  96ORF178  -3  5655..5813  52  gacatcttgattagttttttcagtc  atc  tag
100911  96ORF179  1  34564..34719  51  gttacagctgaagtcgataaaatag  ttg  tag
100912  96ORF180  1  42661..42816  51  atataaattctaacactaaaatact  atg  tga
100913  96ORF181  -2  37741..37896  51  tggacgcactgtcaactgatgtttt  atc  taa
100914  96ORF182  -2  25039..25194  51  ttcgtaatctttttccccgtcatta  att  tga
100915  96ORF183  -2  4534..4689  51  tcagttttaatattttcagccatag  ttg  tga
100916  96ORF184  1  6721..6873  50  ggagctggagaatttacagtaaaag  ttg  tag
100917  96ORF185  2  36548..36700  50  acaaaaatatacgcgatatgaaaat  gtg  taa
100918  96ORF186  -1  40025..40177  50  tggagatcctgaataaacatcactt  ata  tga
100919  96ORF187  -1  34466..34618  50  attacctttaacaaggtcagcgcca  ttg  tga
100920  96ORF188  -1  33842..33994  50  agttcctctatctgattcatagaaa  ctg  taa
100921  96ORF189  -1  24914..25066  50  acatagaatggtcttccgtgtgtga  atc  taa
100922  96ORF190  -2  20395..20547  50  tatcttagagtaaccctctccactc  ata  tga
100923  96ORF191  3  24768..24917  49  aaaggaattgaagcagtgaaacacg  ctg  taa
100924  96ORF192  -1  16169..16318  49  ttgtggtttcggcaacgttgcttgt  atg  tga
100925  96ORF193  -2  39100..39249  49  cagtaccgtttttaccgggtgcgcc  ttg  taa
100926  96ORF194  -2  25921..26070  49  ttggtacagacgtctttgctaatcg  ttg  taa
100927  96ORF195  -2  17779..17928  49  caaccaatgctcgggatggtcaggg  ttg  tga
100928  96ORF196  -2  14182..14331  49  ttaaatacttttcttctagcaatgc  atc  tga
100929  96ORF197  -2  7609..7758  49  ttatcatcaaacgacttaacaccaa  ttg  tga
100930  96ORF198  -2  1537..1686  49  ttattagctagtgcgttagtgttag  gtg  taa
100931  96ORF199  -3  7719..7868  49  taatacttgtatcggatagtcatct  att  taa
100932  96ORF200  2  22271..22417  48  ttctttaatgaggttaaacctctaa  ttg
100933  96ORF201  2  30353..30499  48  tctactattggcgaaaaaataaggc  ttg  tag
100934  96ORF202  2  32591..32737  48  agattgaagcccaacggacaattta  ttg  taa
100935  96ORF203  2  39131..39277  48  agcaaagactttaaagagaaaatag  ata  tag
100936  96ORF204  -2  36985..37131  48  atcttcctggagaacctgtccaact  att  tga
100937  96ORF205  -3  38721..38867  48  aaggaacccttttacaacatcgtcg  ata  taa
100938  96ORF206  -3  35880..36026  48  gttaacatagcgttttgttgcgtca  att  taa
100939  96ORF207  -3  11550..11696  48  ttgctctctcgctccatgattttgg  ata  taa
100940  96ORF208  2  37178..37321  47  agattagtaagacacccttatgtaa  gtg  taa
100941  96ORF209  2  42341..42484  47  tgcatatttaaaccacccatactag  ttg  taa
100942  96ORF210  3  41850..41993  47  aaaggcaataacgtaagggacggct  att  tag
100943  96ORF211  -1  6662..6805  47  ttgtcggaatggtgggacgaattgg  ttg  tga
100944  96ORF212  -2  25213..25356  47  agtagcacattcccaaaattgtaaa  atc  taa
100945  96ORF213  -3  42219..42362  47  gtggtttgatcatttataatataac  ata  taa
100946  96ORF214  3  27834..27974  46  aaaagattttagacttcgttagaac  atc  tag
100947  96ORF215  3  35811..35951  46  ttacgcaatagtttagatgtagacg  ata  taa
100948  96ORF216  -1  5402..5542  46  tttccgtaaggtgtattcaacttga  att  tga
100949  96ORF217  -2  24229..24369  46  cataggtctgttaagcacataacct  atc  taa
100950  96ORF218  -2  6253..6393  46  ttgtcattcttgctaacacgtcaga  ttg  taa
100951  96ORF219  1  883..1020  45  aaatcactcccgaaatattcgttaa  ata  taa
100952  96ORF220  2  32936..33073  45  gataaaggtatagacaaagtattgt  atc  taa
100953  96ORF221  3  41703..41840  45  ggtaagcctataggtggtttggtag  ctg  taa
100954  96ORF222  -1  39860..39997  45  acttttattaggttcaactccattt  att  taa
100955  96ORF223  -1  24716..24853  45  acatttcaaatgattctggaacaac  ata  taa
100956  96ORF224  -2  26794..26931  45  caatatcacgccatgtagtttttaa  ctg  taa
100957  96ORF225  -2  19201..19338  45  caaacaatggatcgtaatcaaataa  atg  tga
100958  96ORF226  -2  15709..15846  45  tgactcgcttgttgtctaacacaat  ata  taa
100959  96ORF227  -3  36711..36848  45  acattgactgccccgataattatct  ata  tga
100960  96ORF228  3  2325..2459  44  tcgccatagtgagttccaataccgt  ata  taa
100961  96ORF229  -1  38612..38746  44  ttgtcattgatacctattcttatag  atg  tga
100962  96ORF230  -1  31733..31867  44  gctggattgtatggcttaaagtaat  ctg  tag
100963  96ORF231  -2  12076..12210  44  tgactcatagctttaacttgttcgt  ctg  taa
100964  96ORF232  -3  31644..31778  44  atagtcctcaagtgttaaccctagt  ttg  taa
100965  96ORF233  -3  23988..24122  44  atttgacttgtaagttcaggctcaa  ctg  taa
100966  96ORF234  -3  17529..17663  44  agtacgtttttttgaatcgtaccta  atg  taa
100967  96ORF235  1  7153..7284  43  aatgctaatggtccaatagaaatca  atg  tag
100968  96ORF236  2  2681..2812  43  ttctttcacttcaacttcacatttc  ata  tga
100969  96ORF237  2  4496..4627  43  gtaccatgcttcacagtcttagcga  ttg  taa
100970  96ORF238  -1  41720..41851  43  cacctgtaattcttgaattagttga  ata  tga
100971  96ORF239  -1  35324..35455  43  acttactaataaaatagaatagttt  gtg  taa
100972  96ORF240  -1  8570..8701  43  atccccgttttgacttaatacatca  atc  tga
100973  96ORF241  -2  33502..33633  43  ataactctgtaatactcttagggat  atg  tag
100974  96ORF242  -2  23662..23793  43  agctaatgctacagcagtgttgtaa  atc  tag
100975  96ORF243  -3  32391..32522  43  acctggacgagcttgcgtcatataa  ata  tag
100976  96ORF244  -3  30273..30404  43  aaaaccttcgttatactcttggtaa  atc  tga
100977  96ORF245  -3  5895..6026  43  tgcactaaaatgcttataattctta  atc  taa
100978  96ORF246  -3  2679..2810  43  attcatcaagaaactatagccggtc  atg  tga
100979  96ORF247  1  34891..35019  42  acaccaagcaaatctggtgtgttag  ttg  taa
100980  96ORF248  2  30668..30796  42  aattattacattaaagctggtgtga  atg  tag
100981  96ORF249  2  31838..31966  42  caaacattagcttgtagtgagttag  atg  taa
100982  96ORF250  2  33539..33667  42  cttaccagaaacagcacaggtagaa  ata  taa
100983  96ORF251  -1  20486..20614  42  cttctgtacgagccacacgcaatga  ttg  tag
100984  96ORF252  -1  15128..15256  42  gatatctcattactagctactacta  ata  tga
100985  96ORF253  -2  41446..41574  42  aaaacctaattcagataaacgataa  t?g  tga
100986  96ORF254  -2  41005..41133  42  gttacaaccatgaccggctacaagc  ata  taa
100987  96ORF255  -2  23008..23136  42  aggataaatgacttgaccatctttc  ata  taa
100988  96ORF256  -2  14794..14922  42  ttgtatgcgtcaatgagttggtcga  ttg  tag
100989  96ORF257  -2  8503..8631  42  tacctaacttttttaataatttcta  atg  tga
100990  96ORF258  -3  22143..22271  42  aaacgctttgtaaaatgcctctgca  att  tga
100991  96ORF259  -3  18639..18767  42  cttgtatctattatagagattaacc  att  tag
100992  96ORF260  -3  15624..15752  42  gttttggtaactagccactgtatag  ata  taa
100993  96ORF261  2  18746..18871  41  cataccgaggctctaatagagtcac  ata  taa
100994  96ORF262  -1  13067..13192  41  aattaattaattcttctcttgttgg  ttg  taa
100995  96ORF263  -2  18742..18867  41  taacagacacgtctaatcgccttac  att  tga
100996  96ORF264  -2  18376..18501  41  catattatcataaagaacaagtaac  ttg  taa
100997  96ORF265  -2  367..492  41  ctaaacgaaaaagagggtacaatac  atc  tga
100998  96ORF266  -3  32802..32927  41  aggtacatccatttgatacaatact  ttg  taa
100999  96ORF267  -3  10194..10319  41  atcatcgaaaggcgataactcgtta  ttg  tga
101000  96ORF268  1  1159..1281  40  ttattcttcctttttgtaattgtaa  atg  taa
101001  96ORF269  2  10373..10495  40  gacagagttgaaaagaaaatcatga  atg  taa
101002  96ORF270  2  15734..15856  40  ttattcggcgtaatcgcactgatgc  ttg  tag
101003  96ORF271  -1  43451..43573  40  (illegible)  att
101004  96ORF272  -1  36959..37081  40  acgctataaaaataacttttattag  at?  tag
101005  96ORF273  -1  35798..35920  40  ctgacgcactttgttggtttgatgc  att  taa
101006  96ORF274  -1  8147..8269  40  tctgtctctctatgtttgttagtct  ctg  tga
101007  96ORF275  -2  43066..43188  40  tttaacttactaattttcttttgat  ata  tga
101008  96ORF276  -2  42535..42657  40  aaataacgtaaattgttttcatagt  att  tag
101009  96ORF277  -2  30628..30750  40  tttgtagtcccgcttctgcaaaagt  ctg  taa
101010  96ORF278  -2  13291..13413  40  ttcgtatcttccaagcaattcattt  ttg  tga
101011  96ORF279  -2  3172..3294  40  cagattgtttagtaacgcctaattt  atc  taa
101012  96ORF280  -3  18804..18926  40  taaataaccaacacgtgtatcaaca  att  tag
101013  96ORF281  -3  15843..15965  40  atttaaaaagtgtattctataacca  atc  tag
101014  96ORF282  -3  8460..8582  40  ttagtcatcactcaattctttttcc  att  taa
101015  96ORF283  -3  7593..7715  40  gatgttgtctacacagtgctaacac  atg  taa
101016  96ORF284  -3  6453..6575  40  aattaatttttaactaccatttcta  att  tga
101017  96ORF285  1  15082..15201  39  caatacttagtcacaacattcaaag  att  taa
101018  96ORF286  1  34444..34563  39  acacaaacgttaatagcaaaagcga  atg  tag
101019  96ORF287  2  27920..28039  39  cctattttagcagttgttgcagtaa  ttg  tag
101020  96ORF288  2  28415..28534  39  atcggccttttaactggcgtaatga  atc  tag
101021  96ORF289  2  38147..38266  39  tatcaaatgcttaatttaggcaagt  atc  tga
101022  96ORF290  3  40917..41036  39  gcaaatttaaacactttcacatcat  atg  taa
101023  96ORF291  -2  38815..38934  39  tctctaaaaacagcttacagcgaac  ata  taa
101024  96ORF292  -2  32671..32790  39  ctataggattataaatcgctgacgt  ata  tga
101025  96ORF293  -2  31216..31335  39  ttgatttgatgtttcttatacttga  ttg  taa
101026  96ORF294  -2  21589..21708  39  gtatcttcatcagaatcgcctaaaa  atc  taa
101027  96ORF295  -2  18976..19095  39  tatcaatatatgctaacctagcacc  ata  taa
101028  96ORF296  -2  11482..11601  39  gccacctcgtactctttttgcaacc  att  taa
101029  96ORF297  -3  12933..13052  39  tcacgaaataatgtttctttaattt  ata  taa
101030  96ORF298  -3  8262..8381  39  gaactgatcttgcttaaatgattta  att  tag
101031  96ORF299  -3  6993..7112  39  cattagcattagcgaatgggtttga  ttg  tga
101032  96ORF300  2  23516..23632  38  actacatctgaacaactaaaatttc  atc  tag
101033  96ORF301  2  25943..26059  38  agattagaagaagaaaaaagaagac  gtg  taa
101034  96ORF302  2  36929..37045  38  tattggggttttgtaacatggggca  atg  tag
101035  96ORF303  3  4476..4592  38  ataaaagctacctagtagcagtact  atg  tga
101036  96ORF304  3  20586..20702  38  tactctaagatagctaaagcaatac  gtg  tga
101037  96ORF305  3  28356..28472  38  cggttaccaatgtgcttgatacgat  ttg  taa
101038  96ORF306  -1  24359..24475  38  acttaaataaaagccgtatcgtgcc  atg  taa
101039  96ORF307  -1  20147..20263  38  ttgtacctatacgagttaactcctt  att  tag
101040  96ORF308  -2  38158..38274  38  ttccgtatccactttctaagaaagc  gtg  tga
101041  96ORF309  -2  35149..35265  38  agcttgtttgtatcgtctttaacga  ata  taa
101042  96ORF310  -2  31423..31539  38  gtaatatgattaggtctcctcttat  ttg  taa
101043  96ORF311  -2  10438..10554  38  cgcctttaaatcgttttaggtcact  atc  taa
101044  96ORF312  -2  1390..1506  38  gagaacaacacaaacactaacaaca  atc  taa
101045  96ORF313  -3  33051..33167  38  acgtcctgtttctagatcgtaatac  ata  tag
101046  96ORF314  -3  25194..25310  38  agcaaaccgttaaagataacattga  atc  taa
101047  96ORF315  -3  6273..6389  38  cattcttgctaacacgtcagattga  ctg  tga
101048  96ORF316  -3  4281..4397  38  ataatccgtattcattaatcattaa  att  tag
101049  96ORF317  1  2260..2373  37  atgactccttttctcatatttcttt  ata  taa
101050  96ORF318  2  21230..21343  37  atttcacacttttttagttgtctct  ttg  taa
101051  96ORF319  3  18018..18131  37  atactgagtcaccaatttaagctcg  atg  tag
101052  96ORF320  3  36972..37085  37  attacagatatcctaagggtttccg  att  taa
101053  96ORF321  -1  36302..36415  37  ctcttgagttttttgacctaattta  atc  taa
101054  96ORF322  -1  32606..32719  37  ccataagttatttctccagttctat  att  taa
101055  96ORF323  -1  11453..11566  37  ttaaaccgttcttttttatcaattc  att  tga
101056  96ORF324  -1  7268..7381  37  tactggttcgccccagtgaagttct  ata  tga
101057  96ORF325  -2  32347..32460  37  ttactgcatttgtatatggcgataa  atc  tag
101058  96ORF326  -2  24682..24795  37  acgtttattacgctcataaagccat  ata  tag
101059  96ORF327  -2  23905..24018  37  aaatggctgtggcgcttgaccatat  gtg  taa
101060  96ORF328  -2  21460..21573  37  agagcactaatacgtttttgttctt  ctg  tga
101061  96ORF329  -2  21208..21321  37  gacttaacttcttcgatattcatat  atc  tga
101062  96ORF330  -2  18085..18198  37  ccagccgacaccagcaaagtatcct  ttg  tag
101063  96ORF331  -2  8170..8283  37  accttgagacgtcgtctgtctctct  atg  tag
101064  96ORF332  -2  5971..6084  37  caatttgttttccgttttctcttag  ttg  tag
101065  96ORF333  -3  37632..37745  37  accttgcttaatcaagtcgtaatta  att  tga
101066  96ORF334  -3  29628..29741  37  ctgagttagtgttgtaaaatgtcat  ttg  tag
101067  96ORF335  -3  7164..7277  37  ttagcggatatccgttttctagtaa  atc  taa
101068  96ORF336  1  22903..23013  36  gtaaaaaaagacaatatgactatta  ctg  tga
101069  96ORF337  1  43258..43368  36  taattgacgtggttattttttaggt  ttg  taa
101070  96ORF338  2  12668..12778  36  gaactggtggaatgggcatggaaca  atc  tag
101071  96ORF339  2  28292..28402  36  ttcactgctttaattcagttgctta  ctg  taa
101072  96ORF340  2  35396..35506  36  ttcctaatgaacataagtcaacggt  att  tga
101073  96ORF341  3  25428..25538  36  actcgagaacaattagaaaaagcaa  ttg  tga
101074  96ORF342  -1  40913..41023  36  tatctgggaaatttaatctaataaa  ata  tga
101075  96ORF343  -1  39173..39283  36  tgccacattttagtgtcaggattga  ttg  ta?
101076  96ORF344  -1  37580..37690  36  gggtctacctttaacgtcgtttcag  ata  taa
101077  96ORF345  -1  31556..31666  36  ggatcattctttctaataacttcaa  ttg  tga
101078  96ORF346  -1  29972..30082  36  ggctactccttatctaaaatataat  ttg  taa
101079  96ORF347  -1  28787..28897  36  ctgccaaagtctgtagcaattactt  ttg  tga
101080  96ORF348  -1  21839..21949  36  ttaaaatccgataaaataacattgc  ctg  tga
101081  96ORF349  -1  3647..3757  36  taaaacttccgaagttacccagcgt  ttg  tga
101082  96ORF350  -2  40801..40911  36  accattccaattttgcccatatgat  gtg  tag
101083  96ORF351  -2  38953..39063  36  tatcttttaaaattctcgtaatagc  atc  taa
101084  96ORF352  -2  31585..31695  36  tagctgtcatcactagtatttttga  atc  taa
101085  96ORF353  -2  24550..24660  36  atagtccgttttaccgcctcgtact  att  tag
101086  96ORF354  -2  20083..20193  36  atcatcattttgatatttctcaaac  ata  tga
101087  96ORF355  -2  991..1101  36  gcatcttggcagtacgacgtaaaac  atc  tag
101088  96ORF356  -3  38148..38258  36  taagaaagcgtgcgcgatcaaataa  att  tga
101089  96ORF357  -3  8790..8900  36  tgaagttatctagcgctatttttct  ttg  tag
101090  96ORF358  -3  4458..4568  36  ttcataaaagtattctttgtagtat  atg  tag
101091  96ORF359  1  4666..4773  35  ttatcaaaatatacaacttaattaa  atc  tag
101092  96ORF360  1  11569..11676  35  ataaacttaccgaacatgaaaatga  att  tga
101093  96ORF361  2  6122..6229  35  ggaaaacaaattgatgttgtagtga  ttg  taa
101094  96ORF362  -1  40418..40525  35  ttcgtaggtgtcattacttctttaa  ttg  tag
101095  96ORF363  -1  34358..34465  35  gttttgcttgatttcgatttgttga  atg  tga
101096  96ORF364  -1  20654..20761  35  ctatttccactgattccccatctaa  atg  tga
101097  96ORF365  -1  8423..8530  35  tcttttttagagttacgaggtttca  att  tag
101098  96ORF366  -1  2402..2509  35  tgacgtatggcaacattttagatca  atc  taa
101099  96ORF367  -2  36607..36714  35  aaaataaaaagccagtgccgaagca  ctg  tag
101100  96ORF368  -2  27061..27168  35  caaatcgtcctgcagcgttcaataa  atc  tag
101101  96ORF369  -2  26470..26577  35  atgagttgttaagttcaccccaaat  atc  taa
101102  96ORF370  -2  10327..10434  35  ccgtgccatcttctcggtataagta  ata  taa
101103  96ORF371  -2  8650..8757  35  gggtacgggttgttactgttgatat  atc  taa
101104  96ORF372  -3  14382..14489  35  gctcttttaattgatctactgttaa  att  taa
101105  96ORF373  -3  8151..8258  35  atgtttgttagtctctgtgtagtct  atg  taa
101106  96ORF374  -3  5007..5114  35  aaacgatttaagtggaacattattc  ata  taa
101107  96ORF375  2  30563..30667  34  cgattagaaatctttaaaaaaggac  ttg  tga
101108  96ORF376  -1  19916..20020  34  tctatgtcaggtaatttgtcattaa  att  taa
101109  96ORF377  -1  9236..9340  34  cttttctgttagtaattgtttttaa  atc  taa
101110  96ORF378  -1  9026..9130  34  actctttatctttagttgcttttaa  ata  tag
101111  96ORF379  -2  28447..28551  34  cttttgtgataataaagtttagtgc  ttg  tga
101112  96ORF380  -3  40329..40433  34  ccatttaccttcttgagatgttgga  ??g  tga
101113  96ORF381  -3  39801..39905  34  caaaagatgaaggctttttccatac  ttg  taa
101114  96ORF382  -3  33831..33935  34  atgttgtttgtaactcgattaagtt  atc  tga
101115  96ORF383  -3  33687..33791  34  gttattacgtcttaatacttgtgtt  gtg  tag
101116  96ORF384  -3  13530..13634  34  tatacgcactagtactgatcactga  ttg  taa
101117  96ORF385  -3  3843..3947  34  tttgattgattgttctagttaagaa  att  taa
101118  96ORF386  1  12256..12357  33  agtcataaagaagttagcaatgtga  ttg  tag
101119  96ORF387  2  2207..2308  33  tccaagactctttaactgttaactt  atc  tag
101120  96ORF388  2  2519..2620  33  attgttgaatttcgattgatctaaa  atg  tga
101121  96ORF389  2  22517..22618  33  agaagtaaaatgcgtaatgctttag  atg  tag
101122  96ORF390  2  27302..27403  33  ttccaaaattgggctaatagtgtag  ctg  taa
101123  96ORF391  2  32384..32485  33  actaaaaaggttgagaaagctgtag  atg  taa
101124  96ORF392  2  39287..39388  33  aaaaacggtactgtagtatcaatca  atc  tag
101125  96ORF393  3  18153..18254  33  gtagtatatgccgactttgatttga  atg  taa
101126  96ORF394  3  24189..24290  33  tcagaccctaacattaacaaactag  ttg  tga
101127  96ORF395  -1  15266..15367  33  tcgataatttgtatagcttgtttta  atg  tag
101128  96ORF396  -2  32239..32340  33  ttttagtgaaagcatctagtgttga  ata  tag
101129  96ORF397  -2  16123..16224  33  ttatgtgtgcctatcatattaacaa  ttg  tag
101130  96ORF398  -2  13648..13749  33  tctttaactgaatgttgaatagcat  ttg  tag
101131  96ORF399  -2  10987..11088  33  acttctgtaggtattcttatatcaa  ttg  tga
101132  96ORF400  -2  3382..3483  33  cttactggtaattcttcaaaattaa  atg  taa
101133  96ORF401  -3  40794..40895  33  ccatatgatgtgaaagtgtttaaat  ttg  taa
101134  96ORF402  -3  39978..40079  33  atattcctaaatcacttgaacctaa  att  tga
101135  96ORF403  -3  38607..38708  33  atcttcagtgtaaaatcgacagcca  atg  tag
101136  96ORF404  -3  21288..21389  33  cagacaccgtcttaagtccctttag  ata  taa
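As a hedged aside that the source does not state: in the entries above, each position span appears to cover the listed size plus one further codon, so that end minus start plus one equals three times (size plus one). The sketch below checks that relation for one entry. The function name is hypothetical, and reading the size column as a count of amino acids that excludes the stop codon is an assumption.

```python
def orf_span_matches_size(position: str, size_aa: int) -> bool:
    """Return True when a 'start..end' span equals (size_aa + 1) codons.

    Assumes the span is inclusive at both ends and that the size column
    counts amino acids without the stop codon (an assumption, see above).
    """
    start_text, end_text = position.split("..")
    span_nt = int(end_text) - int(start_text) + 1
    return span_nt == 3 * (size_aa + 1)


# Values copied from the entry for 96ORF108 (position 23876..24088, size 70).
print(orf_span_matches_size("23876..24088", 70))
```

A check of this kind can help flag digits that were misread during scanning, since a single wrong digit in a coordinate usually breaks the relation.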
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Table 11 



SEQUENCE INFORMATION FOR PHAGES MATCHING WITH TABLE 1 

M32695

Bacteriophage PM2 nuclease cleavage site
gi|166145|gb|M32695|BM2NCS [166145]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693

Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32694

Bacteriophage PM2 Hind III fragment 3
gi|166143|gb|M32694|BM23HIND3 [166143]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M26134

Bacteriophage PM2 structural protein gene containing purine/pyrimidine rich
regions and anti-Z-DNA-IgG binding regions, complete cds
gi|289360|gb|M26134|BM2PROTIV [289360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
J02452

bacteriophage fi 3'-terminal region rna
gi|215409|gb|J02452|PFITR3 [215409]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
AF020798

Bacteriophage Chp1 genome DNA, complete sequence
gi|217761|dbj|D00624|BCPi [217761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 12 protein links, or 1 genome link)
X72793

Clostridium botulinum C phage BONT/C1, ANTP-139, ANTP-33, ANTP-17, ANTP-70
genes and ORF-22
gi|516171|emb|X72793|CBCBONT [516171]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 4 nucleotide neighbors)
X51464

Clostridium botulinum D Phage C3 gene for exoenzyme C3
gi|14907|emb|X51464|CBDPE3 [14907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
D90210

Bacteriophage c-st (from C. botulinum) C1-tox gene for botulinum C1 neurotoxin
gi|217780|dbj|D90210|CSTC1TOX [217780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
S49407

type D neurotoxin [bacteriophage d-16 phi, host=C. botulinum, type D, CB16, Genomic, 4087 nt]
gi|260238|gb|S49407|S49407 [260238]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X53370

Bacteriophage phi29 temperature sensitive mutant TS2(98) DNA polymerase gene
gi|15733|emb|X53370|POTS298 [15733]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X53371

Bacteriophage phi29 temperature sensitive mutant TS2(24) DNA polymerase gene
gi|15731|emb|X53371|POTS224 [15731]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X05973

Bacteriophage phi29 prohead RNA
gi|15680|emb|X05973|POP29PRO [15680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
V01155

Left end of bacteriophage phi-29 coding for 15 potential proteins. Among
these are the terminal protein and the proteins encoded by the genes 1, 2 (sus), 3, and (probably) 4
gi|15659|emb|V01155|POP29B [15659]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 16 protein links, or 16 nucleotide neighbors)
X73097

Bacteriophage phi-29 left origin of replication
gi|312194|emb|X73097|BP29ORIL [312194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M14430

Bacteriophage phi-29 gene-17 gene, complete cds
gi|215321|gb|M14430|P29G17A [215321]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 8 nucleotide neighbors)
M14431

Bacteriophage phi-29 gene-16 gene, complete cds
gi|215319|gb|M14431|P29G16A [215319]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 7 nucleotide neighbors)
M20693

Bacteriophage phi-29 DNA, 3' end
gi|215343|gb|M20693|P29REPINB [215343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M21016

Bacteriophage phi-29 DNA, 5' end
gi|215342|gb|M21016|P29REPINA [215342]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M12456

Bacteriophage phi-29 genes 9, 10 and 11 encoding p9 tail, incomplete, p10
connector, complete, and p11 lower collar, incomplete, respectively
gi|215338|gb|M12456|P29P9 [215338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein, pre-neck
appendage protein, morphogenesis(13), lysis, morphogenesis(15), encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)
M26968

Bacteriophage phi-29 (from Bacillus subtilis) proteins p1 delta-1 genes, complete cds, and the sus1(629) mutation
gi|341558|gb|M26968|P29P1D1A [341558]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
J02448

Bacteriophage f1, complete genome
gi|166201|gb|J02448|F1CCG [166201]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 205 nucleotide neighbors, or 1 genome link)

M24832 

Bacteriophage f2 coat protein gene, partial cds 
gi| 1 66228|gb|M24832|F2CRNACA [ 1 6622 8] 

(View GenBank report,FASTA report,ASN.l report f Graphical view.l MEDLINE link, 1 protein link, or 4 nucleotide neighbors ) 
J02451 

Bacteriophage fd, strain 478, complete genome 
gi|215394|gb!J02451|PFDCG [215394] 

(View GenBank repon,FASTA report,ASN.l report,Grapbical view,5 MEDLINE links, 10 protein links, 204 nucleotide neighbors, 
or 1 genome link ) 

M34834 ' 
Bacteriophage fr replicase gene, 5' end 
gi| 1 66 1 39|gb|M34834|BFRREGRA [166139] 

(View GenBank report,FASTA report^SN.l report, Graphical view.l protein link, or 9 nucleotide neighbors ) 
M38325

Bacteriophage fr replicase gene, 5' end
gi|166137|gb|M38325|BFRREGR [166137]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M35063

Bacteriophage fr coat protein replicase cistron (R region) RNA
gi|166134|gb|M35063|BFRRCRRA [166134]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
S66567

alpha-atrial natriuretic factor/coat protein=fusion polypeptide [human,
bacteriophage fr, expression vector pFAN15, Plasmid Synthetic Recombinant, 510 nt]
gi|435742|gb|S66567|S66567 [435742]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 15 nucleotide neighbors)
X15031

Bacteriophage fr RNA genome
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, 9 nucleotide neighbors,
or 1 genome link)

U51233

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable
region light chain (IgM) mRNA, partial cds
gi|1277150|gb|U51233|MMU51233 [1277150]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1669 nucleotide neighbors)
U51232

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region heavy chain (IgM) mRNA, partial cds
gi|1277148|gb|U51232|MMU51232 [1277148]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1073 nucleotide neighbors)
U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)

V00604

Phage M13 genome
gi|14959|emb|V00604|INM13X [14959]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 205 nucleotide
neighbors)

A32252

Synthetic bacteriophage M13 protein III probe
gi|1567340|emb|A32252|A32252 [1567340]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
A32251

Synthetic bacteriophage M13 protein III probe
gi|1567339|emb|A32251|A32251 [1567339]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M12465

Bacteriophage M13 mp10 mutations in lac operon
gi|215210|gb|M12465|M13LACMUT [215210]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 215 nucleotide neighbors)
M24177

Synthetic Bacteriophage M13 (clone M13.SV.B12) SV40 early promoter region DNA
gi|209416|gb|M24177|SYNSVB12 [209416]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24176

Synthetic Bacteriophage M13 (clone M13.SV.B11) SV40 early promoter region DNA
gi|209415|gb|M24176|SYNSVB11 [209415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24175

Synthetic Bacteriophage M13 (clone M13.SV.8) SV40 early promoter region DNA
gi|208806|gb|M24175|SYNM13SV8 [208806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 242 nucleotide neighbors)
M19979

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207813|gb|M19979|SYN33M13M [207813]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 617 nucleotide neighbors)
M19565

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207808|gb|M19565|SYN33M13H [207808]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 567 nucleotide neighbors)
M19564

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207807|gb|M19564|SYN33M13G [207807]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 12 nucleotide neighbors)
M19563

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207806|gb|M19563|SYN33M13F [207806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 262 nucleotide neighbors)
M19561

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207804|gb|M19561|SYN33M13D [207804]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 27 nucleotide neighbors)
M19560

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207803|gb|M19560|SYN33M13C [207803]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M19559

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207802|gb|M19559|SYN33M13B [207802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 227 nucleotide neighbors)
M10568

Bacteriophage M13 replicative form II, replication origin, specific nick location
gi|215220|gb|M10568|M13ORIB [215220]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 650 nucleotide neighbors)
M10910

Bacteriophage M13 gene II regulatory region and M13sji mutant
gi|215209|gb|M10910|M13IIREG [215209]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 72 nucleotide neighbors)
M38295

Bacteriophage M13 HaeIII restriction fragment DNA
gi|215208|gb|M38295|M13HAEIII [215208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 67 nucleotide neighbors)
E02067

DNA encoding a part of Bacteriophage M13 te 127
gi|2170311|dbj|E02067|E02067 [2170311]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
J02467

Bacteriophage MS2, complete genome
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors,
or 1 genome link)

AJ004950

Bacteriophage P1 ban gene
gi|3688226|emb|AJ011592|BP1011592 [3688226]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
U88974

Bacteriophage P1 structural lytic transglycosylase (orf47), pep44b (orf44b),
pep44a (orf44a), and pep43 (orf43) genes, complete cds; and pep42 (orf42) gene, partial cds
gi|2661099|gb|AF035607|AF035607 [2661099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 1 nucleotide neighbor)

AJ000741

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X01828

Bacteriophage P1 recombinase gene cin
gi|15133|emb|X01828|MYP1CIN [15133]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X98146

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
S61175

immI operon: icd=cell division repressor, ant1=antirepressor {promoters
P51a, P51b} [bacteriophage P1, Genomic, 728 nt]
gi|385908|gb|S61175|S61175 [385908]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)

X87824

Bacteriophage P1 gene 26
gi|861164|emb|X87824|XXBP1G26 [861164]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X15638

Phage P1 DNA for lytic replicon containing promoter P53 and two open reading frames
gi|15735|emb|X15638|PP1LREP [15735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 24 nucleotide neighbors)
X17512

Bacteriophage P1 DNA for immunity region immI
gi|15479|emb|X17512|P1IMMUNIY [15479]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
X16005

Bacteriophage P1 c1 gene for P1 c1 repressor protein
gi|15477|emb|X16005|P1C1 [15477]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X03453

Bacteriophage P1 cre gene for recombinase protein
gi|15135|emb|X03453|MYP1CRE [15135]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
X06561

Bacteriophage P1 c1 gene 5'-region
gi|15128|emb|X06561|MYP1C1 [15128]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 6 nucleotide neighbors)
V01534

Bacteriophage P1 genome fragment (IS2 insertion spot). This region contains
four unidentified reading frames and is known as an insertion hot spot for IS2 insertion sequences
gi|15118|emb|V01534|MYOVP1 [15118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)

X56951

Bacteriophage P1 gene 10
gi|406728|emb|X56951|BPP1GP10 [406728]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
K02380

Bacteriophage P1 replication region including repA, parA, and parB genes and
incA, incB, and incC incompatibility determinants
gi|215652|gb|K02380|PP1REP [215652]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
X87674

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

X87673

Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M16618

Bacteriophage P1 c1 repressor binding sites
gi|215600|gb|M16618|PP1C1 [215600]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
SEG_PP1CIN

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
K03173

Bacteriophage P1 C invertible element, right end, and cixR recombination site
gi|215606|gb|K03173|PP1CIN2 [215606]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
215605

[illegible] recombinase, cixL recombination site, and 5' end of C invertible element

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25470

Bacteriophage P1 tail fiber protein gene, complete cds
gi|341349|gb|M25470|PP1TFPR [341349]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
M34382

Bacteriophage P1 sim region proteins, complete cds
gi|215661|gb|M34382|PP1SIM [215661]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M81956

Bacteriophage P1 R protein (R) gene, complete cds
gi|215658|gb|M81956|PP1RP [215658]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)
M37080

Bacteriophage P1 mini-P1 plasmid origin of replication
gi|215657|gb|M37080|PP1REPOR [215657]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 46 nucleotide neighbors)
M27041

Bacteriophage P1 ref gene, complete cds
gi|215650|gb|M27041|PP1REF [215650]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L01408

Bacteriophage P1 partition protein (parB) gene, 3' end
gi|215642|gb|L01408|PP1PARB [215642]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 41 nucleotide neighbors)

SEG_PP1PAR

Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215639|gb||SEG_PP1PAR [215639]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 48 nucleotide neighbors)
M36425

Bacteriophage miniplasmid P1 parB gene, 3' end
gi|215638|gb|M36425|PP1PAR2 [215638]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M36424
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215637|gb|M36424|PP1PAR1 [215637]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11129

Bacteriophage P1 miniplasmid origin of replication region
gi|215632|gb|M11129|PP1ORIM [215632]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 43 nucleotide neighbors)
M25414

Bacteriophage P1 c1 repressor binding site, operator 88 (Op88)
gi|215631|gb|M25414|PP1OP88A [215631]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M25413

Bacteriophage P1 c1 repressor binding site, operator 68 (Op68)
gi|215630|gb|M25413|PP1OP68A [215630]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25412

Bacteriophage P1 c1 repressor binding site, operator 21 (Op21)
gi|215629|gb|M25412|PP1OP21A [215629]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10510

Bacteriophage P1 recombination site loxR
gi|215628|gb|M10510|PP1LOXR [215628]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M10494

Bacteriophage P1 recombination site loxP
gi|215626|gb|M10494|PP1LOXP [215626]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 134 nucleotide neighbors)
M10511

Bacteriophage P1 recombination site loxL
gi|215625|gb|M10511|PP1LOXL [215625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10512

Bacteriophage P1 recombination site loxB
gi|215624|gb|M10512|PP1LOXB [215624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M10145

Bacteriophage P1 genome fragment with recombination site loxP
gi|215623|gb|M10145|PP1CREX [215623]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 21 nucleotide neighbors)
M13327

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI326
gi|215622|gb|M13327|PP1CN26IV [215622]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13325

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI326
gi|215621|gb|M13325|PP1CN26II [215621]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13323

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI325
gi|215620|gb|M13323|PP1CN25IV [215620]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13321

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI325
gi|215619|gb|M13321|PP1CN25II [215619]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M13324

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI326
gi|215618|gb|M13324|PP1CIR26I [215618]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13319

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI327
gi|215617|gb|M13319|PP1CIN27R [215617]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13320

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI325
gi|215616|gb|M13320|PP1CIN25I [215616]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13318

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI324
gi|215615|gb|M13318|PP1CIN24L [215615]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1370 nucleotide neighbors)
M13317

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI323
gi|215614|gb|M13317|PP1CIN23M [215614]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1055 nucleotide neighbors)
M13316

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI323
gi|215613|gb|M13316|PP1CIN23L [215613]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13315

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI322
gi|215612|gb|M13315|PP1CIN22R [215612]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13314

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI322
gi|215611|gb|M13314|PP1CIN22L [215611]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13313

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI321
gi|215610|gb|M13313|PP1CIN21R [215610]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13312

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI321
gi|215609|gb|M13312|PP1CIN21L [215609]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M16568

Bacteriophage P1 c4 repressor gene, complete cds
gi|215603|gb|M16568|PP1C4 [215603]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M13326

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI326
gi|215602|gb|M13326|PP1C26III [215602]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1192 nucleotide neighbors)
M13322

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI325
gi|215601|gb|M13322|PP1C25III [215601]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1231 nucleotide neighbors)
J05651

Bacteriophage P1 modulator protein (bof) gene, complete cds
gi|215598|gb|J05651|PP1BOFY1 [215598]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M33224

Bacteriophage P1 regulatory protein (bof) gene, complete cds
gi|215596|gb|M33224|PP1BOFFO [215596]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10288

E.coli/bacteriophage P1 loxR recombination site
gi|146647|gb|M10288|ECOLOXR [146647]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10289

E.coli/bacteriophage P1 loxL recombination site
gi|146646|gb|M10289|ECOLOXL [146646]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10290

E.coli loxB site, which can recombine with bacteriophage P1 loxP site
gi|146645|gb|M10290|ECOLOXB [146645]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M74046

Bacteriophage P1 pacA and pacB genes, complete cds
gi|215634|gb|M74046|PP1PACAB [215634]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M95666

Bacteriophage P1 gene 10, doc and phd genes, complete cds
gi|463276|gb|M95666|PP1PHDDOC [463276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 1 nucleotide neighbor)
M25604

Bacteriophage Q-beta mutated autonomously replicating sequence MDV1 RNA fragment
gi|556359|gb|M25604|PQBARSMUT [556359]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
V00643

first half of the phage Q-beta gene for coat protein
gi|15088|emb|V00643|LEQBET [15088]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25167

Bacteriophage Q-beta RNA fragment recovered from replicase binding complex
gi|556362|gb|M25167|PQBREPLICB [556362]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24876

Bacteriophage Q-beta replicase RNA, 5' end
gi|556360|gb|M24876|PQBREPLICA [556360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25444

Synthetic bacteriophage Q-beta DNA
gi|209118|gb|M25444|SYNPQBTERM [209118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M25463

Bacteriophage Q-beta self-replicating microvariant (+) RNA
gi|532489|gb|M25463|PQBMVSRRNA [532489]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25014

Bacteriophage Q-beta RNA replicase gene, 5' end, and maturation protein gene, 3' end
gi|294316|gb|M25014|PQBREPLC [294316]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M25065

Bacteriophage Q-beta RNA sequence with putative stem loop
gi|294315|gb|M25065|PQBLOOP [294315]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10265

Bacteriophage Q-beta RNA molecule with the ability to replicate extracellularly
gi|215726|gb|M10265|PQBRNA [215726]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M24815

Bacteriophage Q-beta specified replicase subunit RNA
gi|215725|gb|M24815|PQBREPL [215725]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M25461

Bacteriophage Q-beta plus-strand RNA, 5' terminus
gi|215724|gb|M25461|PQBPS5E [215724]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25462

Bacteriophage Q-beta plus-strand RNA, 3' terminus
gi|215723|gb|M25462|PQBPS3E [215723]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24871

Bacteriophage Q-beta nanovariant WSIII RNA
gi|215722|gb|M24871|PQBNVWSIC [215722]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24870

Bacteriophage Q-beta nanovariant WSII RNA
gi|215721|gb|M24870|PQBNVWSIB [215721]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24869

Bacteriophage Q-beta nanovariant WSI RNA
gi|215720|gb|M24869|PQBNVWSIA [215720]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10495

Coliphage Q-beta MDV-1(+) RNA
gi|215719|gb|M10495|PQBMDV1A [215719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
J02484

bacteriophage qbeta coat protein cistron first half
gi|215717|gb|J02484|PQBCP5 [215717]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M57754

Bacteriophage Q-beta minus strand RNA, 5' terminus
gi|215716|gb|M57754|PQBBMS5E [215716]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24297

Bacteriophage Q-beta 5'-terminal region of the minus strand
gi|215715|gb|M24297|PQB5END [215715]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
Bacteriophage Q-beta, MDV-1 RNA
gi|215714|gb|M10695|PQB1IR [215714]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 12 nucleotide neighbors)
M24827

Bacteriophage R17 A protein gene, 5' end
gi|216078|gb|M24827|R17RNACIS [216078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M24829

Bacteriophage R17 coat protein gene, 5' end
gi|216075|gb|M24829|R17CP5 [216075]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02488

bacteriophage r17 rna synthetase initiation site
gi|216080|gb|J02488|R17RNASYN [216080]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 6 nucleotide neighbors)
J02487

bacteriophage r17 coat protein initiation site
gi|216073|gb|J02487|R17COATP [216073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02486

bacteriophage r17 a protein initiation site
gi|216071|gb|J02486|R17APROT [216071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M24826

Bacteriophage R17 coat protein RNA fragment
gi|216077|gb|M24826|R17CPRAA [216077]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M24296

Bacteriophage R17 3'-terminal fragment A RNA
gi|216070|gb|M24296|R173TFA [216070]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
1TFN

structure refinement for a 24-nucleotide rna hairpin, nmr, minimized average
structure ribonucleic acid, hairpin, bacteriophage r17 mol_id: 1; molecule: r17c; chain: null; engineered: yes
gi|1942336|pdb|1TFN| [1942336]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
1RPEA

rna (5'-d(gpgpgpapcpupgpapcpgpapupcpapcpgpcpapgpupcpupapu-3') (24-mer rna
hairpin coat protein binding site for bacteriophage r17) (nmr, minimized average structure)
gi|1421020|pdb|1RHT| [1421020]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
M14428

Bacteriophage S13 circular DNA, complete genome
gi|216089|gb|M14428|S13CG [216089]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 12 protein links, 26 nucleotide neighbors,
or 1 genome link)

J05393

Bacteriophage T1 DNA N-6-adenine-methyltransferase (M.T1) gene, complete cds
gi|166163|gb|J05393|BT1NAMTA [166163]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
L46845

Bacteriophage T2 frd3, frd2 genes, complete cds
gi|951387|gb|L46845|PT2FRD32G [951387]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 17 nucleotide neighbors)
L43611

Bacteriophage T2 fibritin (wac) gene, complete cds
gi|903869|gb|L43611|PT2WAC [903869]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
M24812

Bacteriophage T2 secondary structure RNA sequence
gi|215796|gb|M24812|PT2RNA [215796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M22342

Bacteriophage T2 DNA-(adenine-N6)methyltransferase (dam) gene, complete cds
gi|215792|gb|M22342|PT2DAM [215792]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
S57515

orf 61.2 {intergenic region between 41 and 61} [bacteriophage T2, Genomic, 323 nt]
gi|298524|gb|S57515|S57515 [298524]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X05312

Bacteriophage T2 gene 38 for receptor recognizing protein
gi|15197|emb|X05312|MYT2G38 [15197]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04442

Bacteriophage T2 gene 37 for receptor recognizing protein
gi|15195|emb|X04442|MYT2G37 [15195]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X12460

Bacteriophage T2 gene 32 mRNA for single-stranded DNA binding protein
gi|15192|emb|X12460|MYT2G32 [15192]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 14 nucleotide neighbors)
X57797

Bacteriophage T2 gene for gp12
gi|14875|emb|X56555|BT2GP12 [14875]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 2 nucleotide neighbors)
X01755

Bacteriophage T2 tail fiber gene 36
gi|15189|emb|X01755|MYT2F36 [15189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M14784

[illegible] fiber [illegible] DNA [illegible], complete cds

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 10 nucleotide neighbors)

SEG_PT3RNAPOL

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|710559|gb||SEG_PT3RNAPOL [710559]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M22610

Bacteriophage T3 RNA polymerase III gene, 3' end
gi|340722|gb|M22610|PT3RNAPOL2 [340722]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

M22609

Bacteriophage T3 RNA polymerase III gene, 5' end
gi|340721|gb|M22609|PT3RNAPOL1 [340721]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

X05031

Bacteriophage T3 gene region 1-2.5 with primary origin of replication
gi|15719|emb|X05031|POT3ORI [15719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 5 nucleotide neighbors)
X03964 

Bacteriophage T3 early control region pos. 308-810 from genome left end 
gi|157!8!emb|X03964|POT3EP [15718] 

(View GenBank report,FASTA reporUSN.l report,Graphical view,2 MEDLINE links, or 20 nucleotide neighbors ) 
X17255 

Bacteriophage T3 gene 1 to gene 11
gi|15682|emb|X17255|POT3111G [15682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 36 protein links, 17 nucleotide neighbors, or 1 genome link)

X15840 
Phage T3 gene 10 

gi|15625|emb|X15840|PODT3G10 [15625] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X02981 

Bacteriophage T3 gene 1 for RNA polymerase 
gi|15561|emb|X02981|PODOT3P [15561]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02503 

bacteriophage t3 5' end, terminally redundant sequence (trs) 
gi|215816|gb|J02503|PT3TRS1 [215816]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEG_PT3TRS

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215818|gb||SEG_PT3TRS [215818]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02504 

bacteriophage t3 3' end, terminally redundant sequence (trs)
gi|215817|gb|J02504|PT3TRS2 [215817]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

http://www.rs.noda.sut.ac.jp/~kunisawa
Bacteriophage T4 genomic database compiled by Arisaka et al.

X95646 

Bacteriophage T5 DNA for region 60.5%-71% of the T5 genome
gi|2791557|emb|AJ001191|BTJ001191 [2791557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 MEDLINE links, 12 protein links, or 6 nucleotide neighbors)
X56847 

Bacteriophage T5 genomic region encoding early genes D10-D15 
gi|15407|emb|X12930|MYT5D10 [15407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 4 nucleotide neighbors)
AF039886 

Bacteriophage T5 subclone T5.5.3r5.18r, single pass sequence, genomic survey sequence 

gi|2811154|gb|AF039886|AF039886 [2811154]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039885 

Bacteriophage T5 subclone T5.40f,41f, single pass sequence, genomic survey sequence

gi|2811153|gb|AF039885|AF039885 [2811153]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039884 

Bacteriophage T5 subclone T5.26.fr, single pass sequence, genomic survey sequence 

gi|2811152|gb|AF039884|AF039884 [2811152]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039883 

Bacteriophage T5 subclone 10-T5.5.7F, single pass sequence, genomic survey sequence 

gi|2811151|gb|AF039883|AF039883 [2811151]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039882 

Bacteriophage T5 subclone 41-T5.5.4BF, single pass sequence, genomic survey sequence

gi|2811150|gb|AF039882|AF039882 [2811150]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039881 

Bacteriophage T5 subclone 39-T5.5.4aF, single pass sequence, genomic survey sequence 
gi|2811149|gb|AF039881|AF039881 [2811149]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF039880

Bacteriophage T5 subclone 19-T5.7.2r, single pass sequence, genomic survey sequence

gi|2811148|gb|AF039880|AF039880 [2811148]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039879 

Bacteriophage T5 subclone 18-T5.7.2F, single pass sequence, genomic survey sequence 

gi|2811147|gb|AF039879|AF039879 [2811147]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039878 

Bacteriophage T5 subclone 11-T5.5.7R, single pass sequence, genomic survey sequence
gi|2811146|gb|AF039878|AF039878 [2811146]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)

AF039877 

Bacteriophage T5 subclone T5.4FR, single pass sequence, genomic survey sequence 

gi|2811145|gb|AF039877|AF039877 [2811145]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039876 

Bacteriophage T5 subclone 22-T5.16R, single pass sequence, genomic survey sequence 

gi|2811144|gb|AF039876|AF039876 [2811144]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039875 

Bacteriophage T5 subclone 21-T5.16R, single pass sequence, genomic survey sequence 

gi|2811143|gb|AF039875|AF039875 [2811143]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039874 

Bacteriophage T5 subclone 21-T5.16T, single pass sequence, genomic survey sequence

gi|2811142|gb|AF039874|AF039874 [2811142]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039873 

Bacteriophage T5 subclone 09-T5.6F, single pass sequence, genomic survey sequence

gi|2811141|gb|AF039873|AF039873 [2811141]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039872 

Bacteriophage T5 subclone 09-T5.6R, single pass sequence, genomic survey sequence 
gi|2811140|gb|AF039872|AF039872 [2811140]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)
AF039871 

Bacteriophage T5 subclone 04-T5.26JI, single pass sequence, genomic survey sequence 

gi|2811139|gb|AF039871|AF039871 [2811139]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039870 

Bacteriophage T5 subclone 13-T5.42F, single pass sequence, genomic survey sequence 

gi|2811138|gb|AF039870|AF039870 [2811138]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)



X69460

Bacteriophage T5 ltf gene for L-shaped tail fibers
gi|15415|emb|X69460|MYT5LTF [15415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 1 protein link, or 4 nucleotide neighbors)
X03402 

Bacteriophage T5 D15 gene for 5' exonuclease 
gi|15413|emb|X03402|MYT5EXOG [15413]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
Z11972

Bacteriophage T5 tRNA-Tyr, tRNA-Glu, tRNA-Trp, tRNA-Phe, tRNA-Cys and 
tRNA-Asn genes, and ORFs 91aa, 90aa, 42aa and 172aa
gi|15795|emb|Z11972|T56TRNAG [15795]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X03898 

Bacteriophage T5 genes for tRNA-His, -Ser and -Leu 
gi|15786|emb|X03898|STT5RN1 [15786]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 MEDLINE links)
X04177 

Bacteriophage T5 gene for transfer RNA-Gln 
gi|15421|emb|X04177|MYT5TRNQ [15421] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
X03899 

Bacteriophage T5 genes for tRNA-Val, -Lys, -fMet, -Pro and -Ile3
gi|15787|emb|X03899|STT5RN2 [15787]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X03798 

Bacteriophage T5 gene for tRNA-Asp (GUC) 
gi|15472|emb|X03798|NCT5TRDG [15472] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Y00364 

Bacteriophage T5 tRNA gene cluster (27.8%-22.4%) 
gi|15420|emb|Y00364|MYT5TRN [15420]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X03140 

Bacteriophage T5 DNA with rho-dependent transcription terminator (Hind III-P fragment)
gi|15417|emb|X03140|MYT5RHO [15417]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

Z35070 
Bacteriophage T6 DNA 

gi|535228|emb|Z35074|MYEREGBT6 [535228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF060870

Coliphage T6 small subunit distal tail fiber (gene 36) gene, partial cds; and large subunit distal tail fiber (gene 37) and tail fiber 
adhesin (gene 38) genes, complete cds 
gi|3676458|gb|AF052605|AF052605 [3676458] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 2 nucleotide neighbors)
Z35072

Bacteriophage T6 DNA encoding ORF19.1 gene and g19 gene
gi|535232|emb|Z35072|MYTAILT6 [535232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X12488

Bacteriophage T6 gene 32 mRNA for single-stranded DNA binding protein
gi|15843|emb|X12488|MYT6G32 [15843]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors)
Z78095 

Bacteriophage T6 DNA (1506 bp) 

gi|1488562|emb|Z78095|BPHZ78095 [1488562]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z35079

Bacteriophage T6 DNA for Ip5, Ip6
gi|535215|emb|Z35079|MY57BT6 [535215]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X68725 

E.coli bacteriophage T6 gene for beta-glucosyl-HMC-alpha-glucosyl-transferase 
gi|296439|emb|X68725|ECT6 [296439] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
X69894 

Bacteriophage T6 alt gene for ADP-Ribosyltransferase 
gi|15422|emb|X69894|MYT6ADP [15422]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L46846 

Bacteriophage T6 frd3, frd2 genes, complete cds 
gi|951390|gb|L46846|PT6FRD32G [951390]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
M27738 

Bacteriophage T6 translational repressor protein (regA), complete cds 
gi|215993|gb|M27738|PT6REGA [215993] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 5 nucleotide neighbors)
M38465 

Bacteriophage T6 DNA ligase gene, complete cds 
gi|215991|gb|M38465|PT6LIG55 [215991]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
V01146
Genome of bacteriophage T7 
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)

X60322 

Bacteriophage alpha3 genes A, B, K, C, D, E, J, F, G, H
gi|14775|emb|X60322|BACALPHA [14775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 22 nucleotide neighbors, or 1 genome link)

X13332 

Bacteriophage alpha3 DNA for origin of replication 
gi|15093|emb|X13332|MIA3ORPL [15093]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X12611

Bacteriophage alpha3 gene for protein A part, finger domain
gi|15092|emb|X12611|MIA3AFIN [15092]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 6 nucleotide neighbors)
X15721 

Bacteriophage alpha3 deletion mutation DNA for the origin region (-ori) of replication
gi|14774|emb|X15721|BA3DMOR9 [14774]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15720 

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication 
gi|14773|emb|X15720|BA3DMOR8 [14773]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
X15719

Bacteriophage alpha3 insertion mutant DNA for the origin region (-ori) of replication 
gi|14772|emb|X15719|BA3DMOR7 [14772] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15718 

Bacteriophage alpha3 deletion mutation DNA for origin region (-ori) of replication
gi|14771|emb|X15718|BA3DMOR6 [14771]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15717 

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14770|emb|X15717|BA3DMOR5 [14770]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
X15716 

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication 
gi|14769|emb|X15716|BA3DMOR4 [14769]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15715

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14768|emb|X15715|BA3DMOR3 [14768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15714 

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication 
gi|14767|emb|X15714|BA3DMOR2 [14767]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15713 

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14766|emb|X15713|BA3DMOR1 [14766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X62059 

Bacteriophage alpha3 origin of cDNA synthesis (oriGA)
gi|14763|emb|X62059|AL3ORIGA [14763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X62058 

Bacteriophage alpha3 origin of cDNA synthesis (oriAA) 
gi|14762|emb|X62058|AL3ORIAA [14762]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
J02444 

Bacteriophage alpha3 origin of DNA replication 
gi|166103|gb|J02444|AL3ORI [166103] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
M25640 

Bacteriophage alpha-3 H protein gene, complete cds 
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631 

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein 
gi|166099|gb|M10631|AL3CSA [166099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X00774 

Bacteriophage alpha-3 gene J sequence 
gi|15431|emb|X00774|NCBA3J [15431]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M25640 

Bacteriophage alpha-3 H protein gene, complete cds 
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631 

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein 
gi|166099|gb|M10631|AL3CSA [166099] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
Bacteriophage lambda, complete genome
gi|215104|gb|J02459|LAMCG [215104] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)

J02482 

Bacteriophage phi-X174, complete genome 
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)

J02454 

Bacteriophage G4, complete genome 
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)

X60323 

Bacteriophage phiK complete genome 
gi|1478118|emb|X60323|BPHIKCG [1478118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, 18 nucleotide neighbors, or 1 genome link)
L42820 

Bacteriophage BF23 tail protein (hrs) gene, complete cds 
gi|1048680|gb|L42820|BBFHRS [1048680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X54455 

Bacteriophage BF23 gene 17 and gene 18 
gi|14797|emb|X54455|BF231718G [14797] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
M37097 

Bacteriophage BF23 DNA, right end of terminal repetition 
gi|166115|gb|M37097|BBFRIGH [166115]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M37096 

Bacteriophage BF23 DNA, left end of terminal repetition 
gi|166114|gb|M37096|BBFLEFT [166114]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M37095 

Bacteriophage BF23 A2-A3 gene, complete cds, and A1 gene, 5' end
gi|166110|gb|M37095|BBFA2A3 [166110]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
AF056281 

Bacteriophage BF23 clone bf23.mac5/6.1, genomic survey sequence 

gi|3090930|gb|AF056281|AF056281 [3090930]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056280

Bacteriophage BF23 clone bf23.mac3, genomic survey sequence 

gi|3090929|gb|AF056280|AF056280 [3090929]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056279 

Bacteriophage BF23 clone bf23.mac18/21.34, genomic survey sequence
gi|3090928|gb|AF056279|AF056279 [3090928]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056278 

Bacteriophage BF23 clone bf23.mac16/19.33, genomic survey sequence

gi|3090927|gb|AF056278|AF056278 [3090927] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056277 

Bacteriophage BF23 clone bf23.mac16/19-33, genomic survey sequence

gi|3090926|gb|AF056277|AF056277 [3090926]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056276 

Bacteriophage BF23 clone bf23.mac12/9-9, genomic survey sequence

gi|3090925|gb|AF056276|AF056276 [3090925]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056275 

Bacteriophage BF23 clone bf23.mac11/14-24, genomic survey sequence

gi|3090924|gb|AF056275|AF056275 [3090924]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056274 

Bacteriophage BF23 clone bf23.57r64r, genomic survey sequence 
gi|3090923|gb|AF056274|AF056274 [3090923] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 3 nucleotide neighbors)
AF056273 

Bacteriophage BF23 clone bf23.54fr, genomic survey sequence 

gi|3090922|gb|AF056273|AF056273 [3090922] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056272 

Bacteriophage BF23 clone bf23.47fr.mac10/7, genomic survey sequence

gi|3090921|gb|AF056272|AF056272 [3090921]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056271 

Bacteriophage BF23 clone bf23.23.66r, genomic survey sequence

gi|3090920|gb|AF056271|AF056271 [3090920]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056270 

Bacteriophage BF23 clone bf23.23.64f, genomic survey sequence 

gi|3090919|gb|AF056270|AF056270 [3090919]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056269

Bacteriophage BF23 clone bf23.23.60r, genomic survey sequence 

gi|3090918|gb|AF056269|AF056269 [3090918]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056268 

Bacteriophage BF23 clone bf23.23.60f, genomic survey sequence 
gi|3090917|gb|AF056268|AF056268 [3090917] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF056267 

Bacteriophage BF23 clone bf23.23.59r, genomic survey sequence 
gi|3090916|gb|AF056267|AF056267 [3090916] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056266 

Bacteriophage BF23 clone bf23.23.59f, genomic survey sequence 

gi|3090915|gb|AF056266|AF056266 [3090915]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056265 

Bacteriophage BF23 clone bf23.23.56r, genomic survey sequence 

gi|3090914|gb|AF056265|AF056265 [3090914] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056264 

Bacteriophage BF23 clone bf23.23.56f, genomic survey sequence

gi|3090913|gb|AF056264|AF056264 [3090913]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056263 

Bacteriophage BF23 clone bf23.23.68135r, genomic survey sequence 
gi|3090912|gb|AF056263|AF056263 [3090912] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056262 

Bacteriophage BF23 clone bf23.23.43fr.66f, genomic survey sequence 

gi|3090911|gb|AF056262|AF056262 [3090911]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056261 

Bacteriophage BF23 clone bf23.23.2fr, genomic survey sequence 

gi|3090910|gb|AF056261|AF056261 [3090910]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056260 

Bacteriophage BF23 clone bf23.23.55.f, genomic survey sequence

gi|3090909|gb|AF056260|AF056260 [3090909]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056259 

Bacteriophage BF23 clone bf23.23.53.r, genomic survey sequence 

gi|3090908|gb|AF056259|AF056259 [3090908] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056258

Bacteriophage BF23 clone bf23.23.53.f, genomic survey sequence 

gi|3090907|gb|AF056258|AF056258 [3090907]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056257 

Bacteriophage BF23 clone bf23.23.52.r, genomic survey sequence 
gi|3090906|gb|AF056257|AF056257 [3090906] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056256 

Bacteriophage BF23 clone bf23.23.52.f, genomic survey sequence 
gi|3090905|gb|AF056256|AF056256 [3090905]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056255 

Bacteriophage BF23 clone bf23.23.49.r, genomic survey sequence 

gi|3090904|gb|AF056255|AF056255 [3090904]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056254 

Bacteriophage BF23 clone bf23.23.49.f, genomic survey sequence 

gi|3090903|gb|AF056254|AF056254 [3090903] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056253 

Bacteriophage BF23 clone bf23.23.48.r, genomic survey sequence 

gi|3090902|gb|AF056253|AF056253 [3090902]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056252 

Bacteriophage BF23 clone bf23.23.48.f, genomic survey sequence 

gi|3090901|gb|AF056252|AF056252 [3090901]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056251 

Bacteriophage BF23 clone bf23.23.44.r, genomic survey sequence 

gi|3090900|gb|AF056251|AF056251 [3090900] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056250 

Bacteriophage BF23 clone bf23.23.41.f, genomic survey sequence 

gi|3090899|gb|AF056250|AF056250 [3090899]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056249 

Bacteriophage BF23 clone bf23.23.22.ar, genomic survey sequence 

gi|3090898|gb|AF056249|AF056249 [3090898]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056248 

Bacteriophage BF23 clone bf23.23.22.a.f, genomic survey sequence

gi|3090897|gb|AF056248|AF056248 [3090897]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056247

Bacteriophage BF23 clone bf23.23.68.r, genomic survey sequence 

gi|3090896|gb|AF056247|AF056247 [3090896] 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

Z50114

Bacteriophage BF23 DNA for putative tail protein gene
gi|2464952|emb|Z50114|BF23LATE [2464952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
D12824 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
Z34953 

Bacteriophage K3 ip9, ip7 and ip8 genes 
gi|535261|emb|Z34953|MYK3IP978 [535261]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
Z35075

Bacteriophage K3 DNA for Ip3 and Ip4
gi|535229|emb|Z35075|MYEORF64K [535229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X05560 

Bacteriophage K3 gene 38 for receptor recognizing protein 
gi|15112|emb|X05560|MYK3G38 [15112]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04747 

Bacteriophage K3 gene 37 for receptor recognizing protein
gi|15110|emb|X04747|MYK3G37 [15110]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X01754 

Bacteriophage K3 tail fiber gene 36 
gi|15108|emb|X01754|MYK3F36 [15108]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M16812

Bacteriophage K3 V lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
L46833 

Bacteriophage K3 frd3, frd2 genes, complete cds 
gi|951377|gb|L46833|PK3FRD32G [951377] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
L43613 

Bacteriophage K3 fibritin (wac) gene, complete cds
gi|903861|gb|L43613|PK3WAC [903861]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)




X01753 

Bacteriophage Ox2 tail fiber gene 36

gi|15122|emb|X01753|MYOX2F36 [15122]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
L43612 

Bacteriophage Ox2 fibritin (wac) gene, complete cds
gi|903848|gb|L43612|OX2WAC [903848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)

Z46880 
Bacteriophage Ox2 stp gene
gi|599663|emb|Z46880|BPOX2STP [599663]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X05675 

Bacteriophage Ox2 gene 38 for receptor-recognizing protein and flanking regions
gi|15124|emb|X05675|MYOX2G38 [15124]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M33533 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033329 

giSS^ 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 11 nucleotide neighbors)
M86231 

Bacteriophage RB69 gene 62, 3'end; RegA (regA) gene, complete cds 
gi|215354|gb|M86231|P6962REGA [215354] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
AF033332 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 12 nucleotide neighbors)
U34036 

Bacteriophage RB69 DNA polymerase (43) gene, complete cds 
gi|1237125|gb|U34036|BRU34036 [1237125] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
V01145

^sss^sssssssr Each l ** m givcn " sequence represcnts a 

gi|15557|emb|V01145|PODOH1 [15557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X05676 

Bacteriophage M1 gene 38 for receptor recognizing protein and flanking regions
gi|15114|emb|X05676|MYM1G38 [15114]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
AF034575

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033321 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X55190 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033334 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 5 nucleotide neighbors)
X55191

SEtS^ *°" reCeptor - reC0 ^ »«* 37 «=ds), 38 gene for recepror-recognizing protein 38. 
gi|14863|emb|X55191|BPTUIB [14863]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
X13065 

Bacteriophage phi80 early region 
gi|14800|emb|X13065|BP80ER [14800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 6 nucleotide neighbors)
D00360 
Bacteriophage phi80 cor gene 
gi|217782|dbj|D00360|P8080COR [217782] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X01639 

Bacteriophage phi 80 DNA-fragment with replication origin 
gi|15828|emb|X01639|XXPHI80 [15828]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 25 nucleotide neighbors)
X04051 

Lambdoid bacteriophage phi 80 int-xis region (integrase-excisionase region) 
gi|15770|emb|X04051|STPHI80X [15770] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X06751 

Phage Phi80 DNA for major coat protein 
gi|15768|emb|X06751|STPHI80C [15768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 11 nucleotide neighbors)
X75949 

Bacteriophage phi80 DNA for ORF xl71.S and ORF xl71.28' 
gi|458811|emb|X75949|ECORF171B [458811]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 28 nucleotide neighbors)
L40418

Bacteriophage phi-80 gene, complete cds 
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M24831 

Bacteriophage phi-80 Tyr-tRNA gene, 3' end
gi|215363|gb|M24831|P80TGY [215363]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 43 nucleotide neighbors)
M10670

Bacteriophage phi-80 replication origin 
gi|215361|gb|M10670|P80ORI [215361]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M24825 

Bacteriophage phi-80 RNA fragment 
gi|215360|gb|M24825|P80M3A [215360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M11919

Bacteriophage phi-80 cI immunity region encoding the N gene
gi|215358|gb|M11919|P80CI [215358]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
M10891

Bacteriophage phi-80 attP site DNA 
gi|215357|gb|M10891|P80ATTP [215357]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M19473

Bacteriophage 933J (from E.coli) proviral Shiga-like toxin type 1 subunits A and B genes, complete cds 
gi|215072|gb|M19473|J93SLTI [215072] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 2 protein links, or 20 nucleotide neighbors)
Y10775 

Bacteriophage 933W ileX, stx2A and stx2B genes 
gi|1938206|emb|Y10775|BP933ILEX [1938206]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 36 nucleotide neighbors)
X83722 

Bacteriophage 933W slt-IIB gene
gi|1490229|emb|X83722|B933WSLT [1490229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 20 nucleotide neighbors)
X07865 

Bacteriophage 933W slt-II gene for Shiga-like toxin type II subunit A and B
gi|14892|emb|X07865|BWSLTII [14892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 29 nucleotide neighbors)
M16625 

Bacteriophage H19B (from E.coli) sltIA and sltIB genes encoding Shiga-like toxin I subunits A and B, complete cds
gi|215043|gb|M16625|H19BSLT [215043]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 24 nucleotide neighbors)
M17358

Bacteriophage H19B shiga-like toxin-1 (SLT-I) A and B subunit DNA, complete cds
gi|215046|gb|M17358|H19BSLTA [215046]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 20 nucleotide neighbors)
U29728 

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds 
gi|939708|gb|U29728|BNU29728 [939708] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2, outer membrane porin protein (lc) and ORF1 genes,
complete cds
gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors)
U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors)
X51522 

Bacteriophage P4 complete DNA genome 
gi|450916|emb|X51522|MYP4CG [450916] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 13 protein links, 6 nucleotide neighbors,
or 1 genome link)

X92588 

Bacteriophage 82 orf33, orf151, orf36, orf96, rus, orf45, and Q genes
gi|1051111|emb|X92588|BAC82HOLL [1051111]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 protein links, or 1 nucleotide neighbor)
J02803 

Bacteriophage 82 antitermination protein (Q) gene, complete cds
gi|215364|gb|J02803|P82Q [215364]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U02466 

Bacteriophage HK022 (cro), (cII) and (O) genes, complete cds, (P) gene, partial cds
gi|407285|gb|U02466|BHU02466 [407285]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor)
M26291 

Bacteriophage D108 regulatory DNA-binding protein (ner) gene, complete cds
gi|166194|gb|M26291|D18NER [166194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M11272

Bacteriophage D108 left-end DNA
gi|166193|gb|M11272|D18LEDNA [166193]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M18902

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and containing three ORFs, complete cds
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10191

Bacteriophage D108, left end with Mu A protein binding sites L1 and L2
gi|166190|gb|M10191|D18BSL [166190]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02447 

bacteriophage d108 gene a 5' end
gi|166189|gb|J02447|D18AAA [166189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
V00865 

Bacteriophage D108 fragment from genes A and ner (C-terminus of ner and N-terminus of A)
gi|15437|emb|V00865|NCD108 [15437]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X01914 

Bacteriophage UCc gene for DNA binding protein 
gi|14957|emb|X01914|INDCEDBP [14957]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)

AF064539 
Bacteriophage N15, complete genome 
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 60 protein links, 26 nucleotide neighbors,
or 1 genome link)

U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
AF007792 

Bacteriophage Mu late morphogenetic region 
gi|3551775|gb|AF007792|AF007792 [3551775] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
U24159 

Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 41 protein links, 8 nucleotide neighbors,
or 1 genome link)

Z71579

Bacteriophage S2 type A 5.6 kb DNA fragment
gi|1679806|emb|Z71579|BPHSlADNA [1679806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 9 protein links, or 9 nucleotide neighbors)
X53238 

Klebsiella sp. bacteriophage K11 gene 1 for RNA polymerase
gi|14984|emb|X53238|KSK11RPO [14984]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X85010

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
U29728 

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds 
gi|939708|gb|U29728|BNU29728 [939708] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02445 

bacteriophage bo1 3'-terminal region rna
gi|166152|gb|J02445|BO1TR3 [166152]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
L06183

Bacteriophage L5 (from Leuconostoc oenos) genome
gi|289353|gb|L06183|BL5GENM [289353]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 genome link)
AF074945 

Mycoplasma arthritidis bacteriophage MAV1, complete genome
gi|3511243|gb|AF074945|AF074945 [3511243]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 15 protein links, 3 nucleotide neighbors, or 1 genome link)
L13696 

Bacteriophage L2 (from Mycoplasma), complete genome 
gi|289338|gb|L13696|BL2CG [289338] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 14 protein links, or 1 genome link)
X80191

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase proteins
gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 genome link)
M19377 

Bacteriophage Pf3 from Pseudomonas aeruginosa (New York strain), complete genome 
gi|215380|gb|M19377|PF3COMNY [215380] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 5 nucleotide neighbors)
M11912 

Bacteriophage Pf3 from Pseudomonas aeruginosa (Nijmegen strain), complete genome
gi|215371|gb|M11912|PF3COMN [215371]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 5 nucleotide neighbors, or 1 genome link)
V00605 

Bacteriophage Pf1 gene encoding DNA binding protein
gi|14970|emb|V00605|INOPF1 [14970]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L05626 

Bacteriophage PR4 capsid protein (P6) gene, complete cds 
gi|215735|gb|L05626|PR4P6MAJA [215735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
D13409

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosR, attP, int genes
gi|217776|dbj|D13409|BPHCOSR [217776]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
D13408

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosL, ctx genes
gi|217775|dbj|D13408|BPHCOSLCTX [217775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
M24832 

Bacteriophage f2 coat protein gene, partial cds 
gi|166228|gb|M24832|F2CRNACA [166228] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
S72011 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618961|gb|AF017627|AF017627 [2618961] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017624 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
D26449 

Bacteriophage PS17 FI gene for sheath protein (gpFI), FII gene for tube protein (gpFII), complete cds
gi|452162|dbj|D26449|BPSFIFII [452162]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
X87627 

Bacteriophage D3112 A and B genes
gi|974768|emb|X87627|BPD3112AB [974768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
U32623 

Bacteriophage D3 transcriptional activator CII (cII) gene, complete cds
gi|984852|gb|U32623|BDU32623 [984852]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L34781 

[title partly illegible] gene, complete cds, and peptidoglycan hydrolase (lytA) gene, partial cds

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
L14810 

Bacteriophage P22 (gp10) gene, complete cds, and (gp26) gene, complete cds
gi|294053|gb|L14810|P22GP1026X [294053]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 9 nucleotide neighbors)
L42820 

Bacteriophage BF23 tail protein (hrs) gene, complete cds
gi|1048680|gb|L42820|BBFHRS [1048680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X14980

Bacteriophage PRD1 XV gene for protein P15 (lytic enzyme)
gi|15802|emb|X14980|TEPRD1XV [15802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X06321 

Bacteriophage PRD1 gene 8 for DNA terminal protein 
gi|15800|emb|X06321|TEPRD18 [15800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 10 nucleotide neighbors)
X14336 

Filamentous Bacteriophage I2-2 genome
gi|14920|emb|X14336|INBI22 [14920]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 1 nucleotide neighbor, or 1
genome link)
L05001

Bacteriophage X glucosyl transferase gene, complete cds
gi|216044|gb|L05001|PXFGLUSYLT [216044]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M29479 

Bacteriophage P4 sid and psu genes, partial cds, and delta gene, complete cds
gi|215701|gb|M29479|PP4SDP [215701]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 4 nucleotide neighbors)

SEG_PP4PSUSID 
Bacteriophage P4 capsid size determination protein (sid) gene, 5' end 
gi|215698|gb||SEG_PP4PSUSID [215698]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M29650 

Bacteriophage P4 polarity suppression protein (psu) gene, complete cds 
gi|215697|gb|M29650|PP4PSUSID2 [215697]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M29651 

Bacteriophage P4 capsid size determination protein (sid) gene, 5' end
gi|215696|gb|M29651|PP4PSUSID1 [215696]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M27748 

Bacteriophage P4 gop, beta, and cII genes, complete cds and int gene, 3' end
gi|215691|gb|M27748|PP4GOPBC [215691]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 nucleotide neighbor)
K02750 

Bacteriophage IKe, complete genome 
gi|215061|gb|K02750|IKECG [215061] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 4 nucleotide neighbors, or 1
genome link)

L40418 

Bacteriophage phi-80 gene, complete cds 
gi|1019107|gb|L40418|P80A [1019107] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF032122 

Bacteriophage SfII integrase (int) gene, partial cds; and bactoprenol glucosyl transferase (bgt), and glucosyl transferase II (gtrII)
genes, complete cds
gi|2465412|gb|AF021347|AF021347 [2465412]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
M35825 

Bacteriophage SF6 fragment D lysozyme gene, complete cds 
gi|216105|gb|M35825|SF6LY2 [216105]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

Z35479

Bacteriophage C16 tp1 gene
gi|534936|emb|Z35479|BC16TP1 [534936]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X12638

Bacteriophage 21 DNA for gene 2
gi|296141|emb|X12638|B21GENE2 [296141]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X02501 

Bacteriophage 21 DNA for left end sequence with genes 1 and 2
gi|15825|emb|X02501|XXPHA21 [15825]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M58702 

Bacteriophage 21 late gene regulatory region
gi|215465|gb|M58702|PH2LATEGE [215465]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M81255 

Bacteriophage 21 head gene operon 
gi|215454|gb|M81255|PH2HEADTL [215454]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 10 protein links, or 4 nucleotide neighbors)
M23775 

Bacteriophage 21 glycoprotein 1 gene, complete cds, and glycoprotein gene, 5' end
gi|215451|gb|M23775|PH2GPA [215451]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M61865 

Bacteriophage 21 excisionase (xis), integrase (int) and isocitrate dehydrogenase (icd), complete cds 
gi|215448|gb|M61865|PH22XISAA [215448] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 9 nucleotide neighbors)
S72011

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds 
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds 
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017624

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
M57455 

Staphylococcus aureus) staphylokinase gene, complete cds 
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
Y12633 

Bacteriophage 85 DNA, promoter sequence of unknown gene 

gi|2058285|emb|Y12633|B85PROM [2058285]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

X98146 

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
Y07739 

Staphylococcus phage Twort holTW, plyTW genes 
gi|2764979|emb|Y07739|BPTWGHOLG [2764979] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
L07580 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M34832 

Bacteriophage phi-11 integrase (int) and excisionase (xis) genes, complete cds
gi|166157|gb|M34832|BPHINTXIS [166157]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M20394

Bacteriophage phi-11 S.aureus attachment site (attP)
gi|166156|gb|M20394|BPHATTP [166156]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
X23128 

Bacteriophage phi-13 integrase gene
gi|758228|emb|X82312|PHI13INT [758228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
X61719

S.aureus phi-13 lysogen right chromosome/bacteriophage DNA junction
gi|46625|emb|X61719|SAP13RJNC [46625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61718 

S.aureus phi-13 lysogen left chromosomal/bacteriophage DNA junction
gi|46624|emb|X61718|SAP13LJNC [46624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61717 

Bacteriophage phi-13 core sequence for attachment
gi|14799|emb|X61717|BP13ATTP [14799]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
U01875 

Bacteriophage phi-13 putative regulatory region and integrase (int) gene, partial cds
gi|437118|gb|U01875|U01875 [437118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, or 4 nucleotide neighbors)
X67739 

S.aureus Bacteriophage phi-42 attP gene 
gi|14809|emb|X67739|BPATTPA [14809]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
U01872 

Bacteriophage phi-42 integrase (int) gene, complete cds 
gi|437115|gb|U01872|U01872 [437115] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 3 nucleotide neighbors)
X94423 

Staphylococcus aureus bacteriophage phi-42 DNA with ORFs (restriction modification system) 
gi|1771597|emb|X94423|SARMS [1771597]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 1 nucleotide neighbor)
M27965 

Bacteriophage L54a (from S.aureus) int and xis genes, complete cds 
gi|215096|gb|M27965|L54INTXIS [215096]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)




AB009866 

Bacteriophage phi PVL proviral DNA, complete sequence 
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, or 1 nucleotide neighbor)
Z47794

Bacteriophage Cp-1 DNA, complete genome
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or
1 genome link)

SEG_CP7RSIT 

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat 
gi|166186|gb||SEG_CP7RSIT [166186]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11635

Bacteriophage Cp-7 (S.pneumoniae) DNA, 3' inverted terminal repeat
gi|166185|gb|M11635|CP7RSIT2 [166185]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11636

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat
gi|166184|gb|M11636|CP7RSIT1 [166184]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

SEG_CP5RSIT 

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166181|gb||SEG_CP5RSIT [166181]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11633

Bacteriophage Cp-5 (S.pneumoniae) 3' inverted terminal repeat
gi|166180|gb|M11633|CP5RSIT2 [166180]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11634

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166179|gb|M11634|CP5RSIT1 [166179]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M34780 

Bacteriophage Cp-9 muramidase (cpl9) gene 
gi|166187|gb|M34780|CP9CPL [166187] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M34652 

Bacteriophage HB-3 amidase (hbl) gene, complete cds 
gi|215055|gb|M34652|HB3HBLA [215055] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U64984 

Streptococcus pyogenes phage T12 repressor, excisionase (xis), integrase (int) and erythrogenic toxin A precursor (speA) genes,
complete cds
gi|1877426|gb|U40453|SPU40453 [1877426]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 22 nucleotide neighbors)
X12375

Phage CP-T1 (Vibrio cholerae) DNA for packaging signal (pac site)
gi|15435|emb|X12375|NCCPPAC [15435]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF087814 

Vibrio cholerae filamentous bacteriophage fs-2 DNA, complete genome sequence 
gi|3702207|dbj|AB002632|AB002632 [3702207]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 1 genome link)
D83518 

Bacteriophage KVP40 gene for major capsid protein precursor, complete cds
gi|3046858|dbj|D83518|D83518 [3046858]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033322 

Bacteriophage PST single-stranded binding protein (gene 32) gene, partial cds, and 5' region 
gi|2645774|gb|AF033322|AF033322 [2645774]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X94331 

Bacteriophage L cro, 24, c2, and c1 genes
gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 protein links)
U82619 

Shigella flexneri bacteriophage V glucosyl transferase (gtr), integrase (int) and excisionase (xis) genes, complete cds 
gi|2465470|gb|U82619|SFU82619 [2465470]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 1 nucleotide neighbor)
Table 12

NCBI Entrez Nucleotide QUERY
Key words: bacteriophage and lysis
56 citations found (all selected)



AJ011581 

Bacteriophage PS119 lysis genes 13, 19, 15, and packaging gene 3,
complete cds
gi|3676084|emb|AJ011581|BPS011581 [3676084]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)

AJ011580 

Bacteriophage PS34 lysis genes 13, 19, 15, antiterminator gene 23, and 
packaging gene 3, complete cds 
gi|3676078|emb|AJ011580|BPS011580 [3676078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 2 nucleotide neighbors)

AJ011579 

Bacteriophage PS3 lysis genes 13, 19, 15, and packaging gene 3 
gi|3676073|emb|AJ011579|BPS011579 [3676073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)

AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975|AF034975 [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)

U37314 

Bacteriophage lambda Rz1 protein precursor (Rz1) gene, complete cds
gi|1017780|gb|U37314|BLU37314 [1017780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 1 protein link, or 9 nucleotide neighbors)

U00005 

E.coli hflA locus encoding the hflX, hflK and hflC genes, hfq gene,
complete cds; miaA gene, partial cds
gi|436153|gb|U00005|ECOHFLA [436153]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 5 protein links, or 8 nucleotide neighbors)



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)



AF064539 

Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)



AF063097 

Bacteriophage P2, complete genome 

gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor)



AF059243 

Bacteriophage NL95, complete genome 
gi|3088545|gb|AF059243|AF059243 [3088545]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, 3 nucleotide neighbors, or 1 genome link)



AF052431 

Bacteriophage M11 A-protein, coat protein, A1-protein, and replicase
genes, complete cds
gi|2981208|gb|AF052431|AF052431 [2981208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, or 8 nucleotide neighbors)



Y07739 

Staphylococcus phage Twort holTW, plyTW genes 
gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2
protein links)

X94331

Bacteriophage L cro, 24, c2, and c1 genes
gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 4 protein links)



X78410 

Bacteriophage phiadh holin and lysin genes
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



X99260 

Bacteriophage B103 genomic sequence 
gi|1429229|emb|X99260|BB103G [1429229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 17 protein links, or 12 nucleotide neighbors)



AJ000741 

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 10 protein links, or 31 nucleotide neighbors)



X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 9 nucleotide neighbors)



L35561

Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH5ORFHTR [532218]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 3 protein links)



D10027

Group II RNA coliphage GA genome
gi|217784|dbj|D10027|PGAXX [217784]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, 5 nucleotide neighbors, or 1 genome link)



V01128

Bacteriophage phi-X174 (cs70 mutation) complete genome
gi|15535|emb|V01128|PHIX174 [15535]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 11 protein links, or 26 nucleotide neighbors)
S81763

coat gene...replicase gene [bacteriophage KU1, host=Escherichia coli,
group II RNA phage, Genomic RNA, 3 genes, 120 nt]
gi|1438766|gb|S81763|S81763 [1438766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1
MEDLINE link)



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors)



X91149 

Bacteriophage phi-C31 DNA cos region 
gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor)



V00642 

phage MS2 genome 
gi|15081|emb|V00642|LEMS2X [15081]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE
links, 4 protein links, or 20 nucleotide neighbors)



V01146

Genome of bacteriophage T7
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE
links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)



X78401 

Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, nin
region genes, ninG phosphatase, late control gene 23, orf 60, complete
cds, late control region, start of lysis gene 13
gi|512343|emb|X78401|POP22NIN [512343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 13 protein links, or 4 nucleotide neighbors)



Y00408 

Bacteriophage T4 gene t for lysis protein 
gi|15368|emb|Y00408|MYT4T [15368]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors)
Bacteriophage mv4 lysA and lysB genes
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4
protein links)



X07809 

Phage phiX174 lysis (E) gene upstream region
gi|15094|emb|X07809|MIPHIXE [15094]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors)



Z34528 

Lactococcal bacteriophage c2 lysin gene
gi|506455|emb|Z34528|LBC2LYSIN [506455]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)



X15031 

Bacteriophage fr RNA genome 
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)



X80191 

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase 
proteins 

gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 1 genome link)



X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



X85009 

Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors)



X85008 

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)
Bacteriophage phi-X174 genes for lysis protein and beta-lactamase
gi|520996|emb|Z35638|BPLYSPR [520996]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 516 nucleotide neighbors)



J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)



X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors)



X87673 

Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 1 nucleotide neighbor)



M14784 

Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis 
protein and DNA packaging proteins, complete cds 
gi|215810|gb|M14784|PT3RE [215810]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 9 protein links, or 10 nucleotide neighbors)



M11813

Bacteriophage PZA (from B.subtilis), complete genome
gi|216046|gb|M11813|PZACG [216046]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 27 protein links, 17 nucleotide neighbors, or 1 genome link)



M16812 

Bacteriophage K3 T lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)



J04356 

Bacteriophage P22 proteins 15 (complete cds), and 19 (3' end) genes 
gi|215265|gb|J04356|P2215P [215265]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)



J04343 

Bacteriophage JP34 coat and lysis protein genes, complete cds, and
replicase protein gene, 5' end
gi|215076|gb|J04343|JP3COLY [215076]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)



J02482 

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE
links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)



M99441 

Bacteriophage T4 anti-sigma 70 protein (asiA) gene, complete cds and 
lysis protein, 3' end

gi|215820|gb|M99441|PT4ASIA [215820]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 2 protein links, or 2 nucleotide neighbors)



M65239

Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



M10637

Phage G4 D/E overlapping gene system, encoding D (morphogenetic) and E
(lysis) proteins

gi|215427|gb|M10637|PG4DE [215427]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 12 nucleotide neighbors)



J02454 

Bacteriophage G4, complete genome 
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)



J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2,
outer membrane porin protein (lc) and ORF1 genes, complete cds
gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 4 nucleotide neighbors)
M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein,
pre-neck appendage protein, morphogenesis(13), lysis, morphogenesis(15),
encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 11 protein links, or 11 nucleotide neighbors)



M10997

Bacteriophage P22 lysis genes 13 and 19, complete cds
gi|215262|gb|M10997|P221319 [215262]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 3 nucleotide neighbors)



J02467 

Bacteriophage MS2, complete genome 
gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE
links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)



M14035 

Bacteriophage lambda lysis S gene with mutations leading to nonlethality
of S in the plasmid pRG1
gi|215180|gb|M14035|LAMLYS [215180]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 14 nucleotide neighbors)



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein 
hydrolase (lysB) gene, complete cds
gi|530796|gb|U04309|BPU04309 [530796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)
Table 13



NCBI Entrez Nucleotide QUERY 

Key word: holin

51 citations found (all selected) 
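
Each citation in this table carries an identifier line. As a minimal sketch, assuming those lines follow the pipe-delimited form `gi|number|database|accession|locus [number]`, such a line could be split into its fields as below. The function name `parse_id_line` and the regular expression are illustrative assumptions, not part of the original disclosure; lines still carrying scanning damage are simply rejected.

```python
import re

# Assumed layout: gi|<gi number>|<database>|<accession>|<locus> [<gi number>]
# The accession or locus field may be empty, as in some entries of the table.
ID_LINE = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>gb|emb|dbj)\|(?P<accession>[A-Z0-9_]*)"
    r"\|(?P<locus>\S*)\s+\[(?P<check>\d+)\]$"
)


def parse_id_line(line):
    """Return the fields of an identifier line, or None if it does not parse."""
    match = ID_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # The bracketed number repeats the gi number; use it as a sanity check.
    if record["gi"] != record["check"]:
        return None
    del record["check"]
    return record


if __name__ == "__main__":
    print(parse_id_line("gi|2707950|emb|Z97974|BPHIADH [2707950]"))
```

The bracketed number is treated as a checksum of sorts: a mismatch with the leading gi number is a strong hint that the line was damaged in transcription.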

AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)

U52961 

Staphylococcus aureus holin-like protein LrgA (lrgA) and LrgB (lrgB)
genes, complete cds

gi|1841516|gb|U52961|SAU52961 [1841516]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



U28154 

Haemophilus somnus cryptic prophage genes, capsid scaffolding protein 
gene, partial cds, major capsid protein precursor, endonuclease, capsid 
completion protein, tail synthesis proteins, holin, and lysozyme genes, 
complete cds 

gi|1765928|gb|U28154|HSU28154 [1765928]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 13 protein links)



AF032122 

Streptococcus thermophilus bacteriophage Sfi19 central region of genome
gi|2935682|gb|AF032122| [2935682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors)

AF032121 

Streptococcus thermophilus bacteriophage Sfi21 central region of genome 
gi|2935667|gb|AF032121|AF032121 [2935667]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 14 protein links, or 2 nucleotide neighbors)



WO 00/32825 



PCT/IB99/02040 



255 

AF021803 

Bacillus subtilis 168 prophage SPbeta N-acetylmuramoyl-L-alanine amidase 
(blyA), holin-like protein (bhlA), holin-like protein (bhlB), and yolK
genes, complete cds; and yolJ gene, partial cds
gi|2997594|gb|AF021803|AF021803 [2997594]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 5 protein links, or 1 nucleotide neighbor)



AF057033 

Streptococcus thermophilus bacteriophage Sfi11 gp502 (orf502), gp284
(orf284), gp129 (orf129), gp193 (orf193), gp119 (orf119), gp348
(orf348), gp53 (orf53), gp113 (orf113), gp104 (orf104), gp114 (orf114),
gp128 (orf128), gp168 (orf168), gp117 (orf117), gp105 (orf105), putative
minor tail protein (orf1510), putative minor structural protein
(orf512), putative minor structural protein (orf1000), gp373 (orf373),
gp57 (orf57), putative anti-receptor (orf695), putative minor structural
protein (orf669), gp149 (orf149), putative holin (orf141), putative
holin (orf87), and lysin (orf288) genes, complete cds
gi|3320432|gb|AF057033|AF057033 [3320432]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 25 protein
links, or 1 nucleotide neighbor)



U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)



AB009866 

Bacteriophage phi PVL proviral DNA, complete sequence 
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein
links, or 1 nucleotide neighbor)



AF009630 

Bacteriophage bIL170, complete genome
gi|3282260|gb|AF009630|AF009630 [3282260]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein
links, 3 nucleotide neighbors, or 1 genome link)



AF064539 

Bacteriophage N15, complete genome 
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)



AF063097 

Bacteriophage P2, complete genome 

gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)



Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor)



X95646 

Streptococcus thermophilus bacteriophage Sfi21 DNA; lysogeny module, 
8141 bp 

gi|2292747|emb|X95646|BSFI21LYS [2292747]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 19 protein links, or 3 nucleotide neighbors)



SEG_LLHLYSIN0

Bacteriophage LL-H structural protein gene, partial cds; minor 
structural protein gp61 (g57), unknown protein, unknown protein, 
structural protein (g20), unknown protein, unknown protein, major capsid 
protein (g34), main tail protein gp19 (g17), holin (hol), muramidase
(mur), unknown protein, unknown protein, unknown protein, unknown 
protein, unknown protein, and unknown protein genes, complete cds; 
unknown protein gene, partial cds; and unknown protein, unknown protein, 
unknown protein, unknown protein, unknown protein, minor structural 
protein gp75 (g70), minor structural protein gp89 (g88), minor 
structural protein gp58 (g71), unknown protein, unknown protein, unknown 
protein, and unknown protein genes, complete cds 
gi|1004337|gb||SEG_LLHLYSIN0 [1004337]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 31 protein links, or 1 nucleotide neighbor)



M96254 

Bacteriophage LL-H holin (hol), muramidase (mur), and unknown protein
genes, complete cds

gi|1004336|gb|M96254|LLHLYSIN03 [1004336]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
Y07740

Staphylococcus phage 187 ply187 and hol187 genes
gi|2764982|emb|Y07740|BP187PLYH [2764982]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2
protein links)



U88974 

Streptococcus thermophilus bacteriophage O1205 DNA sequence
gi|2444080|gb|U88974| [2444080]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 57 protein links, or 6 nucleotide neighbors)



Z99117 

Bacillus subtilis complete genome (section 14 of 21): from 2599451 to 
2812870 

gi|2634966|emb|Z99117|BSUB0014 [2634966]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 233
protein links, 51 nucleotide neighbors, or 1 genome link)



Z99115 

Bacillus subtilis complete genome (section 12 of 21): from 2195541 to 
2409220 

gi|2634478|emb|Z99115|BSUB0012 [2634478]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 244
protein links, 64 nucleotide neighbors, or 1 genome link)



Z99110 

Bacillus subtilis complete genome (section 7 of 21): from 1194391 to 
1411140 

gi|2633472|emb|Z99110|BSUB0007 [2633472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 226
protein links, 31 nucleotide neighbors, or 1 genome link)



X78410 

Bacteriophage phiadh holin and lysin genes
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



Z93946 
Bacteriophage Dp-1 dph and pal genes and 5 open reading frames

gi|1934760|emb|Z93946|BPDP1ORFS [1934760]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 6
protein links)



AF011378 

Bacteriophage sk1 complete genome
gi|2392824|gb|AF011378|AF011378 [2392824]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 54 protein
links, 2 nucleotide neighbors, or 1 genome link)



Z47794 

Bacteriophage Cp-1 DNA, complete genome 
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 28 protein links, 1 nucleotide neighbor, or 1 genome link)



L35561 

Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH5ORFHTR [532218]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 3 protein links)



D49712 

Bacillus licheniformis DNA for ORFs, xpaL2 homologous protein and xpaL1
homologous protein, complete and partial cds
gi|1514423|dbj|D49712|D49712 [1514423]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, or 4 protein links)



X90511 

Lactobacillus bacteriophage phig1e DNA for Rorf162, Holin, Lysin, and
Rorf175 genes

gi|1926386|emb|X90511|LBPHIHOL [1926386]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)



X98106 

Lactobacillus bacteriophage phig1e complete genomic DNA
gi|1926320|emb|X98106|LBPHIG1E [1926320]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 50 protein links, or 4 nucleotide neighbors)



U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|B8U72397 [1763241]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein
links, or 2 nucleotide neighbors)



U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors)



X91149 

Bacteriophage phi-C31 DNA cos region
gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor)



U24159 

Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 41 protein links, 8 nucleotide neighbors, or 1 genome link)



Z26590 

Bacteriophage mv4 lysA and lysB genes
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4
protein links)



Z70177 

B.subtilis DNA (28 kb PBSX/skin element region)
gi|1225934|emb|Z70177|BSPBSXSE [1225934]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 32 protein
links, or 4 nucleotide neighbors)



Z36941 
B.subtilis defective prophage PBSX xhlA, xhlB, and xylA genes
gi|535793|emb|Z36941|BSPBSXXHL [535793]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 5 nucleotide neighbors)



X89234 

L.innocua DNA for phagelysin and holin gene 
gi|1134844|emb|X89234|LICPLYHOL [1134844]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors)



X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



X85009 

Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors)



X85008 

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)



L34781 

Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and
peptidoglycan hydrolase (lytA) gene, partial cds
gi|511838|gb|L34781|BPHHOLIN [511838]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 2 nucleotide neighbors)



U11698 

Serratia marcescens SM6 extracellular secretory protein (nucE), putative 
phage lysozyme (nucD), and transcriptional activator (nucC) genes,
complete cds

gi|509550|gb|U11698|SMU11698 [509550]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)
U31763 

Serratia marcescens phage-holin analog protein (regA), putative phage
lysozyme (regB), and transcriptional activator (regC) genes, complete
cds

gi|965068|gb|U31763|SMU31763 [965068]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors ) 

L48605 

Bacteriophage c2 complete genome 
gi|1146276|gb|L48605|C2PVCG [1146276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 39 protein links, 3 nucleotide neighbors, or 1 genome link)

L33769 

Bacteriophage bIL67 DNA polymerase subunit (ORF3-5), essential
recombination protein (ORF13), lysin (ORF24), minor tail protein
(ORF31), terminase subunit (ORF32), holin (ORF37), unknown protein (ORF
1-2, 6-12, 14-23, 25-30, 33-36), complete genome
gi|522252|gb|L33769|L67CG [522252]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 37 protein links, 2 nucleotide neighbors, or 1 genome link)

L31348 

Bacteriophage Tuc2009 integrase (int) gene, complete cds; lysin (lys) 
gene, 3' end 

gi|508612|gb|L31348|TU2INT [508612]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 3 protein links, or 3 nucleotide neighbors)

L31364 

Bacteriophage Tuc2009 holin (S) gene, complete cds; lysin (lys) gene,
complete cds
gi|496281|gb|L31364|TU2SLYS [496281]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



L31366 

Bacteriophage Tuc2009 structural protein (mp2) gene, complete cds 
gi|496278|gb|L31366|TU2MP2A [496278]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)



L31365 

Bacteriophage Tuc2009 structural protein (mp1) gene, complete cds
gi|496276|gb|L31365|TU2MP1A [496276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 1 protein link)



U04309 

Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein 
hydrolase (lysB) gene, complete cds 
gi|530796|gb|U04309|BPU04309 [530796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)
Table 14



NCBI Entrez Nucleotide QUERY
Key word: bacteriophage and kil 
5 citations found (all selected) 
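
The query above combines two key words with "and". As a hedged illustration only, a comparable search could today be expressed as a request URL for the NCBI E-utilities ESearch service. The use of E-utilities, the `nuccore` database name, and the helper `build_query_url` are assumptions of this sketch and are not part of the original disclosure, which used the Entrez interface of its time; a present-day search would be expected to return a different number of citations.

```python
from urllib.parse import urlencode

# Assumption: present-day NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query_url(keywords, retmax=100):
    """Build an ESearch URL that requires every key word (joined with AND)."""
    term = " AND ".join(keywords)
    # "nuccore" is the current name of the Entrez nucleotide database.
    params = {"db": "nuccore", "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    # Only constructs the URL; no network request is made here.
    print(build_query_url(["bacteriophage", "kil"], retmax=5))
```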



AF034975 

Bacteriophage H-19B essential recombination function protein (erf), kil 
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975| [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)

X15637 

Bacteriophage P22 P(L) operon encompassing ral, 17, kil and arf genes 
gi|15646|emb|X15637|POP22PL [15646]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 7 protein links, or 2 nucleotide neighbors)

J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)

M64097

Bacteriophage Mu left end
gi|215543|gb|M64097|PMULEFTEN [215543]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 39 protein links, or 15 nucleotide neighbors)

M18902 

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and
containing three ORFs, complete cds
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors)
Table 16

Phage 44AHJD complete genome sequence. 16668 nucleotides. 
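
The listing that follows gives the sequence in numbered lines, each starting with the position of its first base followed by blocks of ten bases. A minimal sketch of how such a listing could be reassembled into one contiguous string is given below; the function name `parse_sequence_listing` and the strictness of its checks are illustrative assumptions, not part of the original disclosure. Lines that do not begin with a number are skipped, and a position that does not match the running base count raises an error, which helps to flag transcription damage.

```python
def parse_sequence_listing(lines):
    """Join numbered sequence lines into one lowercase nucleotide string.

    Each data line is assumed to look like "<start position> <bases> <bases> ...".
    """
    pieces = []
    total = 0  # number of bases collected so far
    for raw in lines:
        parts = raw.split()
        if not parts or not parts[0].isdigit():
            continue  # blank line or non-sequence text
        start = int(parts[0])
        chunk = "".join(parts[1:]).lower()
        unexpected = set(chunk) - set("acgtn")
        if unexpected:
            raise ValueError(f"unexpected characters {sorted(unexpected)} at {start}")
        if start != total + 1:
            raise ValueError(f"line starts at {start}, expected {total + 1}")
        pieces.append(chunk)
        total += len(chunk)
    return "".join(pieces)


if __name__ == "__main__":
    demo = ["1 acgtacgtac gtacgtacgt", "", "21 ttttt"]
    print(len(parse_sequence_listing(demo)))
```

For the complete table, the length of the returned string can be compared against the 16668 nucleotides stated in the heading.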



1 tccatttctt tactaaactt aaaaatgctg tgcaacaact taaccaactt atctaaccta ttacatattc 

71 atcaaataca aaatttatgt atctattgac ttttattcaa aattatgatt tcaacatata ataaaattaa 

141 tttacttatt taaatattct atgatataat tagttataaa atatttggag gtgtataaat gacagaattt

211 gatgaaatcg taaaaccaga cgacaaagaa gaaacttcag aatcaactga agaaaattta gaatcaactg 

281 aagaaacttc agaatcaact gaagaatcaa ctgaagaatc aactgaagaa tcaactgaag ataaaacagt 

351 agaaacaatc gaagaagaaa atgaaaacaa attagaacct actacaacag atgaagatag ttcgaaattt 

421 gaccctgttg tattagaaca acgtattgct tcattagaac aacaagtgac tactttttta tcttcacaaa 

491 tgcaacaacc acaacaagta caacaaacac aatcagatgt aacagaatca aacaaagaag ataacgacta 

561 ttcagatgaa gaactagttg ataagttaga tttagattag gaggaattta aacatgtatg agggaaacaa 

631 catgcgttct atgatgggta catcatatga agattcaaga ttaaataaac gaacagaatt aaatgaaaac 

701 atgtcaattg atacaaataa aagtgaagat agttatggtg tacaaattca ttcactttca aaacaatcat 

771 ttacaggtga cgttgaggag gaataataaa ttatggcaca acaatctaca aaaaatgaaa ctgcactttt 

841 agtagcaaag tcagctaaat cagcgttaca agattttaat catgattatt caaaatcttg gacatttggc 

911 gacaaatggg ataattcaaa tacaatgttc gaaacatttg taaataaata tttattccct aagattaatg 
981 agactttatt aatcgatatt gcattaggta atcgttttaa ttggttagct aaagagcaag attttattgg 

1051 acaatatagt gaagaatacg tgattatgga cacagtacca attaacatgg acttatctaa aaatgaggaa 

1121 ttaatgttga aacgtaatta tccacgtatg gcaactaagt tatatggtaa cggaattgtg aagaaacaaa 

1191 aattcacatt aaacaacaat gatacacgtt tcaatttcca aacattagca gacgcaacta attacgcttt 

1261 aggtgtatac aaaaagaaaa tttctgatat taatgtatta gaagaaaaag aaatgcgtgc aatgttagtt 

1331 gattactcat tgaatcaatt atccgaaaca aatgtacgta aagcaacatc aaaagaagat ttagcaagca 

1401 aagtttttga agcaatccta aacttacaaa acaacagtgc taaatataat gaagtacatc gtgcatcagg 

1471 tggtgcaatt ggacaatata caactgtatc aaaattaaaa gatattgtga ttttaacaac agattcatta 

1541 aaatcttatc ttttagatac taagattgca aacacattcc agattgcagg cattgatttc acagatcacg 

1611 ttattagttt tgacgactta ggtggcgtgt ttaaagtaac aaaagaattt aagttacaaa accaagattc 

1681 aattgacttt ttacgtgcgt atggagatta tcaatcacaa ttaggagata caattccagt tggtgctgta 

1751 tttacttatg atgtatctaa acttaaagag tttactggca acgttgaaga aattaaacca aaatcagatt 

1821 tatatgcgtt tattttggat attaattcaa ttaaatataa acgttacaca aaaggtatgt taaaaccacc 

1891 attccataac cctgaatttg atgaagttac acactggatt cattactatt catttaaagc cattagtcca 

1961 ttctttaata aaattttaat tactgaccaa gatgtaaatc caaaaccaga ggaagaatta caagaataaa 

2031 aggagcgtaa aatatgaaca acgataaaag aggtttaaac gttgagttat caaaggaaat cagcaaaaga 

2101 gttgttgaac atcgcaacag atttaaacgt cttatgttta atcgttattt ggaattttta ccgctactaa 

2171 tcaactatac caatcgtgat acggttggta tagattttat tcagttagaa tcagctttaa gacaaaacat 

2241 taatgtagtt gttggtgaag ctagaaataa gcaaattatg attcttggtt atgtaaataa cacttacttt 

2311 aatcaagcac caaatttttc atcaaacttt aatttccaat ttcaaaaacg attaactaaa gaagatatat 

2381 attttattgt acctgactat ttaatacctg atgattgtct acaaattcat aagctatatg ataactgtat 

2451 gagtggtaac tttgttgtca tgcaaaataa accaattcaa tataatagtg atatagaaat tatagaacat 

2521 tatactgatg aattagcaga agttgcttta tctcgctttt ctttaatcat gcaagcaaaa tttagcaaga 

2591 tatttaaatc agaaattaat gacgagtcaa tcaatcaact tgtgtccgaa atatataacg gtgcaccatt 

2661 tgttaaaatg tcacctatgt ttaatgcaga tgacgatatc attgatttaa caagtaatag cgtaatccca 

2731 gcattaactg aaatgaaacg ggaatatcaa aacaaaatta gtgaattaag taactattta ggcattaatt 

2801 cattagccgt tgataaagaa agcggtgttt cagacgaaga ggcaaaaagt aatcgtggat ttaccacatc 

2871 aaacagtaat atctatttaa aaggtcgtga accaattacg tttttatcaa agcgttatgg tttagatatt 

2941 aaaccgtatt acgatgatga aacaacgtct aaaatatcaa tggtagacac actttttaaa gatgaaagca 

3011 gtgatataaa tggctagata cacaatgact ttatacgatt tcattaaatc agaattgatt aaaaaaggtt 

3081 tcaatgaatt tgtaaatgat aataaattaa cgttttatga tgatgaattt caattcatgc aaaaaatgct 

3151 gaagttcgac aaagacgttt tagctatcgt taatgaaaaa gtatttaaag gtttttcatt gaaagatgaa 

3221 ttatcagatt tactttttaa aaaatcattt acgattcatt ttttagatag agaaatcaac agacaaacag 

3291 ttgaagcatt tggcatgcaa gtgattactg tatgtattac acatgaggat tatttaaatg tggtttattc 

3361 atcaagtgaa gttgaaaaat acttacaatc acaaggcttc acagaacaca atgaagatac aacaagtaac 

3431 actgatgaaa catcgaatca aaatgctaca tctttagaca attcaactgg catgactgca aacagaaacg 

3501 cttatgtgtc attaccacaa agtgaggtta acattgatgt tgataataca acgttacgat tcgctgataa 

3571 taatacgatt gataacggta aaactgtgaa taaatcgagt aacgaaagta atcaaaacgc aaaacgtaat 

3641 caaaatcaaa aaggtaatgc aaaaggtaca caattcacta agcagtattt aattgataat attgataaag 

3711 cgtacgattt aagaaagaaa attttaaatg aatttgataa aaaatgtttt ttacaaattt ggtagaggtg 

3781 gttaaataat ggcatataat gaaaacgatt ttaaatattt tgatgacatt cgtccatttt tagacgaaat 

3851 ttataaaacg agagaacgtt atacaccgtt ttacgatgat agagcagatt ataatactaa ttcaaaatca 

3921 tattatgatt atatttcaag attatcaaaa ctaattgaag tattagcacg tcgtatttgg gactatgaca 

3991 atgaattaaa aaaacgtttc aaaaattggg acgacttaat gaaagcattt ccagagcaag cgaaagactt 

4061 atttagaggt tggttaaacg acggtacgat tgacagtatt attcatgacg agtttaaaaa atatagcgca 

4131 ggattaacat cggcatttgc tttatttaaa gttactgaaa tgaaacaaat gaatgacttt aaatcagaag 

4201 ttaaagactt aattaaagat attgaccgtt tcgttaatgg gtttgaatta aatgagcttg aaccaaagtt

4271 tgtgatgggc tttggtggta ttcgcaacgc agttaaccaa tctattaata ttgataaaga aacaaatcac 

4341 atgtactcta cacaatccga ttctcaaaaa cctgaaggtt tttggataaa taaattaaca cctagtggtg 

4411 acttaatttc aagcatgcgt attgtacagg gtggtcatgg tacaacaatc ggattagaac gtcaatccaa 

4481 tggtgaaatg aaaatctggt tacatcacga tggtgttgca aaactgttac aagtcgcata taaagataat 

4551 tatgtattag atttagaaga ggctaaaggt ttaacagatt atacaccaca gtcactttta aacaaacaca

4621 catttacacc gttaattgat gaagcaaatg acaaactcat tttaagattc ggtgacggaa caatacaggt 

4691 tcgttcaaga gcagacgtaa aaaatcacat tgataatgta gaaaaagaaa tgacaattga taattcagaa 
4761 aacaatgata atcgttggat gcaaggcatt gctgttgatg gtgatgattt atactggtta agtggtaaca

4831 gttcagttaa ttcacatgtt caaatcggta aatattcatt aacaacaggt caaaagattt atgattatcc 

4901 atttaagtta tcatatcaag acggtattaa tttcccacgt gataacttta aagagcctga gggtatttgc 

4971 atttatacaa atccaaaaac aaaacgtaaa tcgttattac ttgctatgac aaacggcggt ggtggaaaac

5041 gtttccataa tttatatggt ttcttccaac ttggtgagta tgaacacttt gaagcattac gcgcaagagg 

5111 ttcacaaaac tataaattaa caaaagacga cggtcgtgca ttatctattc cagaccatat cgacgattta 

5181 aatgacttaa cgcaagctgg tttttattat attgacgggg gtactgcaga aaaacttaag aatatgccaa 

5251 tgaatggtag caagcgtata attgacgctg gttgtttcat taatgtatac cctacaacac aaacattagg 

5321 tacggttcaa gaattaacac gtttctcaac aggtcgtaaa atggttaaaa tggtgcgtgg tatgacttta 

5391 gacgtattta cgttaaaatg ggattatgga ttatggacaa caatcaaaac tgacgcacca tatcaagaat 

5461 atttggaagc aagtcaatac aataactgga ttgcttatgt aacaacagct ggtgagtatt acattacagg 

5531 taaccaaatg gaattattta gagacgcgcc agaagaaact aaaaaagtgg gtgcatggtt acgtgCgtca 

5601 agtggcaacg cagtcggtga agtaagacaa acattagagg ctaatatatc ggaatataaa gaattcttca 

5671 gtaatgttaa tgcggaaaca aaacatcgcg aatatggttg ggcagcaaaa catcaaaaat aggagtgata 

5741 taaatgaaat cacaacaaca agcaaaagaa tggatataca agcatgaggg ggcaggtgtt gactttgatg 

5811 gtgcatatgg atttcaatgt atggacttat cagttgctta tgcgtattac attactgacg gtaaagttcg 

5881 catgtggggt aatgctaaag acgcgataaa taatgacttt aaaggtttag cgacggtgta taaaaataca 

5951 ccgagcttta aacctcaatt aggggacgtt gctgtatata caaatggaca atatggacat attcaatgtg 

6021 tgttaagtgg aaatcttgat tattatacat gcttagaaca aaactggtta ggcggcggtt ttgacggttg 

6091 ggaaaaagca accattagaa cacattatta tgacggtgta acccacttta ttagacctaa attttcaggt 

6161 agtaatagca aagcattaga aacatcaaaa gtaaatacat ttggaaaatg gaaacgaaac caatacggca 

6231 catattatag aaatgaaaat ggtacattta catgtggttt tttaccaata tttgcacgtg tcggtagtcc 

6301 aaaattatca gaacctaatg gctattggtt ccaaccaaac ggttatacac catataacga agtttgttta 

6371 tcagatggtt acgtatggat tggttataac tggcaaggca cacgttatta tttaccagtg cgccaatgga 

6441 atggaaaaac aggtaatagt tacagtgttg gtattccttg gggggtgttc tcataatggg tattttagcc 

6511 tttttctttg aatttagttg gaaaagatac aaataagagg tgtaaacaat ggctgataga atcgtaagaa 

6581 gtttaagaca agttgaaaca attgaacgtt tattggagga aaaaaatgag aaagttaacg aattttaagt 
6651 ttttctataa cacaccgttt acagactatc aaaacacgat tcattttaat agtaataaag aacgtgatga 

6721 ttatttttta aatggtcgtc attttaaatc gttagactat tcaaaacaac cgtataattt tatacgtgat 

6791 agaatggaaa tcaatgttga tatgcagtgg catgacgcac aaggtattaa ctacatgacg tttttatcag 

6861 atcttgagga tagaagatat tacgcttttg taaaccaaat cgaatacgtg aatgacgttg tggttaaaat 

6931 atattttgtc attgatacca ttatgacgta tacacaaggg aatgtattag agcaactctc aaacgtcaat 

7001 attgaacgtc aacatttatc aaaacgcacg tataactata tgttaccaat gttacgtaat aatgatgatg 

7071 tgttaaaagt atcaaataaa aactatgttt ataaccaaat gcaacaatat ttggaaaatt tagtattatt 

7141 ccagtcaagc gctgatttat caaagaaatt tggtactaaa aaagagccaa acttagatac gtcaaaaggt

7211 acgatttatg acaatatcac atcaccagtc aacttatacg ttatggaata tggtgacttt attaacttta 

7281 tggataaaat gagtgcctat ccatggatta cgcaaaactt tcaaaaggtt caaatgttac ctaaagactt

7351 tattaataca aaagacttag aggacgttaa aaccagtgaa aaaattacag gattaaaaac attaaaacag 

7421 ggtggtaaat caaaagaatg gagtctaaaa gatttatcat taagtttctc aaatcttcaa gagatgatgt

7491 tatctaaaaa agatgaattt aaacatatga tacgtaatga gtatatgaca attgaatttt atgactggaa 

7561 tggaaatacg atgttactcg acgctggtaa gatttcacaa aaaactggtg ttaagttacg tacaaaatca 

7631 attattggtt atcataatga agttcgagta tatccagtag attataacag tgctgaaaac gacagaccaa 

7701 tactcgctaa aaataaagaa atattgattg atacgggttc attcttaaat acaaatataa catttaatag

7771 ttttgcacaa gtaccaatat taatcaataa tggtatctta ggacaatcac aacaagccaa ccgacaaaaa

7841 aatgcagaaa gtcaattaat tacaaatcgt attgataatg tattaaatgg tagcgacccg aaatcacgct

7911 tttatgacgc tgtgagtgta gcaagtaatt taagtccaac tgctttattt ggtaagttta atgaagaata

7981 taatttctac aaacaacaac aagctgaata taaagattta gccttacaac caccttctgt aactgaatca 

8051 gaaatgggca acgcattcca aattgcgaat agcattaacg gtttaacgat gaaaattagt gtaccgtcac 

8121 ctaaagaaat tacattttta caaaaatatt atatgttgtt tggttttgaa gtgaatgact ataattcatt 

8191 tattgaacca attaacagta tgactgtttg caattattta aaatgtacag gtacgtatac tatacgtgac

8261 atcgacccca tgttaatgga acaattaaaa gcaattttag aatctggtgt aagattttgg cataatgacg 

8331 gttcaggtaa tccaatgtta caaaatccat taaataacaa atttagagag ggggtataat atgaacgaag 

8401 taaaattcag atttacagac tcagaagcgt ttcacatgtt tatatacgct ggggatttaa aattactcta

8471 ctttttattt gtattaatgt tcgttgatat tattacaggt atttcaaaag caattaaaaa taataactta

8541 tggtcaaaaa aatcaatgag aggattttct aaaaaattat tgatattctg tattatcatt ttagcaaaca 

8611 tcattgacca gattttacaa ttaaaaggtg gtctactcat gattacaata ttttattata ttgcaaatga 

8681 gggactttct attgtagaaa attgtgcaga aatggacgta ttagtaccag aacaaattaa agataaatta 

8751 agagtcatta aaaatgatac tgaaaagagt gataacaatg aacgatcaag agaagataga taaatttacg 

8821 cattcctata ttaatgatga ttttggttta acgatagacc agttagtccc taaagtaaaa ggatatgggc 

8891 gctttaatgt atggcttggt ggtaatgaaa gtaaaatcag acaagtatta aaagcagtaa aagagatagg 

8961 tgtttcacct actctttttg ccgtatatga aaaaaatgag ggttttagtt ctggacttgg ttggttaaac 

9031 catacgtctg cacgtggtga ttatttaaca gatgctaaat tcatagcaag aaagttagta tcacaatcaa

9101 aacaagctgg acaaccgtct tggtatgacg caggtaacat cgtccacttt gtaccacaag acgtacaaag

9171 aaaaggtaat gcagattttg caaaaaatat gaaagcaggt acaattggac gtgcatatat tccattaaca

9241 gcagctgcta cttgggcggc atattatcct ttaggtttga aagcatcata taacaaagta caaaactatg

9311 gtaatccatt tttagacggt gcgaatacta ttctagcttg gggtggtaaa ttagacggta aaggtggatc 

9381 acctagtgat tcgtctgaca gtggtagtag tggtgacagt ggtagttcac tactcgcttt agcaaaacaa

9451 gccatgcaag aattattaaa aaaaatacaa gacgcattac aatgggacgt tcatagtatt ggtagtgata

9521 aattttttag taatgattat tttacattag aaaaaacatt taacaacaca tatcatatta aaatgacgat

9591 tggtttactt gattcattaa aaaaactgat tgatagcgtt caagtagata gtgggagtag tagttccaat

9661 cctactgatg atgacggaga ccataaacca attagtggta aatcagtcaa gccaaatgga aaaagtggtc

9731 gtgtgattgg tggtaactgg acatatgcac agttaccaga aaaatataaa aaagcaattg gtgtaccttt

9801 attcaaaaaa gaatacttat acaaaccagg taacatattt cctcaaacgg gtaatgcagg acaatgtaca

9871 gaattaacat gggcgtatat gtcacaacta catggtaaaa gacaacctac cgacgacggt caaataacaa

9941 acggtcagcg tgtatggtac gtctataaaa agttaggtgc aaaaacaaca cataatccaa cagtaggtta

10011 tggtttctct agtaaaccac catacttaca agcaactgca tacggtattg gtcacacagg tgttgttgta
10081 gcagtttttg aagatggttc gtttttagtt gcaaactata atgtaccacc atatgttgca ccatcacgtg

10151 tggtattgta tacactcatt aatggcgtac caaataatgc tggtgataat attgtattct ttagtggtat 

10221 tgcttaatta actatgctat aatgaacaca tgctagtaat gctagtaaat aaaatacaaa acataatcaa 

10291 ttttcgtaca catttttcat gttatctcaa aaagaaaagg agactgttat tttaacagtt gccttttttt

10361 atttcatcac gttcacgttt taatatatgc aaatcagatc tgttatgtac tgaacgttca actggaaata 

10431 agtcgttaag tgaaaatgaa ccgatgccac tttcaatata aagaacatca tcaaattgac tatggtcgaa 

10501 attttcccta gcgtctttta atataaattc acgtctcata ttaagttcat cagtaaaata ttcatcatat

10571 acattaccac atacaatctc agttttagac ggatatatcg atattgtacc ttgctcatta tagatacttt

10641 tattgttttc aataatggca ccgtcaaaga attgttcacg tacaaaggct tcaaaaccga cgcttgtatc

10711 aaaggcgttt ttcggtatac cagcagaagc aattttaatc tttccattca cttcatatgc atatttctta

10781 tgattcagta caaacatctt atctatctgt tcgttttcaa tatcccattt acctaaggct atcgggtcga

10851 ataaactggg gttcaataag ggtttaacaa cggatttcat atacaaacta tcagtatcgc aataaataaa

10921 attgtcgtca atttcacttt ccgttaagta ttggaaagga accaataagt tatacaatga acgtgatgtg

10991 acaaatgtag agaataatat attacgttca gtgtttttgt aaccgttaat gatattgtat agttcattgt

11061 tatcatctaa acggaataag ttaaaacgtg aacgtaatgc aggtatgcca tataatccat ttaaaacgac

11131 tttagataac ataacctcct catttgagta tgggtgttcg ttgatatcat cagtaatgtg atagtcgtaa

11201 ggtgatgtca tattgatttt gttttttaac ttaccttgtg ttttaataaa atagttttga aaaataatat 

11271 cacgtgcatg aaagtattca cattcatata taacaaacga attaacacgt atatgcatgc aatcaatacc

11341 cgtaatgtct tgaatcattc ttaatgtatt tgtattgata ttaacgtaat cattatcatt attatagtat 

11411 tttacaatca tttgacgtaa tacacgtgat ttaattttaa ttaataaatc atcgttaaat acatctttat

11481 caatcttata taatgaaaaa taattgtcat catctaaaaa agtagggatt aacgttggtt ctgaatagtg 

11551 ttcgtaaaag tataaccatg ttggaatttt ttcatgatac atcacataag gataactcga attgatgtca 

11621 atagaaaaac aaggctcatc aattagtttg tttatgtatt tggtgttata catatttaaa ccaccacgat 

11691 agaatgattt aatatagtca taaaaattca tatcatggaa atgataatgt gtataagata ttttaatatc 

11761 ttgatattgg ttgagtaact gaaaacgtgt catttcatta ttcaagtaag attccataat attcaatgaa 

11831 aatgttaatt tgttatagtc aaaatttgga aatatatcac tataatgaat atggcacata cctaatataa 

11901 tcacgtcatt atgaatgtat gtaagttgtt caggtgtgag ttttgcaaaa catttcacag catagtcata

11971 ggcttcacta tcattcatat cattatcttt atcaaaaatc gtataattaa aatctgtttt aagttgtgat 

12041 tctgttaaat aaccaccatc aagtaatttc ttacctaatg ttgcaattga tgtattggtt ttcataaagt 

12111 tatcaataat attaaattta aaaccattta aaaacattgt taaatctaaa ttgattgaag atttaacacg 

12181 tttttctaaa attacatttt gatttttggc taaaatagta gcctctttca tttttaatgt gtgttcattt 

12251 tcttctgcag attttaaata tatattttcg cgtgtaatat tatcaaaata acgcatggtg tctttaagta

12321 aaaaatgatt atcgtattta ttacagttat gtgcaatcat gataatatct gtttttgatt ttgtgattgt

12391 atcacgtctt ttcacatacg tataaaatgc gtcataaaaa gattcgaaac tcggaaatac ttcaacatca

12461 atttcataac cattaaacca accaattgct acagaataag taacgttttt atatttggtt ggtttttttc

12531 gtccgttaac tttattgtac gctaatgttt ctatatccca gtataaaatc attcgacgtt catgtttatg

12601 atattgcatg cattctagta atcccataat cttacacacc ttttataagc catattgttt cattagatac

12671 tttttcgtat tctctatata gttatcttcg tatatttttt cttttctttc aaactcactc atatttttct

12741 tcatttcatt ttttatatga aattttataa ttttattcat atctaaatat aaatatctat cattatcaac 

12811 cacgtaattt ttagagtaag cattgtcaaa atgtaaattg cttggattgt agtaataacg ttccatgttt

12881 tctttataaa acatatcatc acgtaaatag gtaacatgat tgtctatatc cctaatttta gtacaaaatt

12951 catattgttt tgtatatggt acaacgataa tatttgtcat aaaagtagtt acattataca tgactttaat 

13021 atatttatca tcagttttga tatagaagaa atcaccgttt tgattgatgt gatttcttaa attatcatcc 

13091 gccaaattat attcgttaaa ttcaaattct ccagttgtca tagcgtcgtc atttgaatta aacgcacgtg

13161 tgttacgttt ttcattcacg taatcgtttc gtcgcatttc taaaaaaatg tttttgtaaa gtcttgatgt

13231 attcatttta tgcttttgta ataaattgta tatatttaaa ttggataata taggacttga aaagttgact 

13301 gcattaccta gtaaaaacat tttagggaat ccaatataat caacgttacc atggttacgg tcgattgatt 

13371 catatattgt ttttaactta tcccactcat caattaaata atcatcttca agtgctaaaa actcatcata 

13441 tataataata ggatagtgtt ttaaaaagtt agaatgatat tttaaatcag tggcactatt caaatctgta 

13511 atcacaccaa tttctttatc ttgatagata atagctaaat agtccctagc acttctgaac gtgacacgtt 

13581 ttgatttaaa tagtggattt tcatctatga tttcttcaat aaaatcacgg taagcgtcac gtaatgtata

13651 atgacgtgat aataaagtaa attttatatc aagtttaata gctaaataaa taaaaaatga aacatagttg

13721 aacgattttc catcagaacg gtttgaaata gatatataat aatctatatc atcattcata agttcatcaa 

13791 ctaattctat ttgattatac ttatctggga ttttttttct gacatgattg acagcatttt gataatctct 

13861 taccatgtct aaacgatttt gttttaccat gtttttgctc cttgtaatag tttatgatgt cgtttacagt

13931 gttaaattta ttcgtcaaat gttgcataat ataaaaagtt atacctcaca tcttcatcat caatatttgt 

14001 cactggtcta tctgatttac caatttcttt atataaagta tcgatttctt taatatattt atacattgaa 

14071 gaattattat ttttagcttg taaattatat aaagcgtatt tatgcttttt agcgttttta ttattagaat

14141 catcattacg gttatatatt tcaagaatat aatttaattt tttatgtctt gaacctctta ccaatgatac 

14211 agcatttaca tatgatacgt ttctttcttt aggaaaatag ggcagatgtg caaaatgttt ccatgtgtca 

14281 atgtacgcct cttgtaaatc tttatcatca aatttaaaat taacattact aaaatcattt aaaaataaat 

14351 ctttttcttg ctcttttcta gcttctcttt cttttttcca tctatccatt tcagacgtat gtctaaccaa

14421 tgttatcaac ctccatataa agcataaata accattaaaa agataatata gaatataatc aatgtagtga 

14491 ataaaacacc aaatgacacg cgtatatgca gtgtcataag tatgataagt gtaattaaaa atgctaaaag

14561 gaaaacaatg gctatgttta ataggttatt catggtcaat cactttccca ttatcgtata tgactttgtt

14631 ttgataaata atcattaatt cgctttcaag aggtttatca aaatttgata atacgtcgtc aattgtaacg

14701 tttaataaaa tttctcttat taattcatta cttaaataat ttctataata aaatacaagt atattaaaaa 

14771 catgtttttt aatatcaatg tcgatatcta acgtaaataa ctctttttca atttcaaaat catcatattg

14841 tttgtcaaac tcaatataca catcacccat atttattttt actatacatt ttttattaga tgaagtaaat 

14911 ttttcaaatt tatcattata ataatctcta tttgttaaaa ggtaataaat taaattattt aatctaaasg- 

14981 tagttttaat tttcattttt atatctcctt aatgtattct atgatatacg cgtatttttt agtgaacagg 

15051 ttatattcat aatatgaata tacaacttta gcgtcatata aatcttcaaa cattgagatt tgatgtggaa

15121 aatgtccttt aatctcatcg caatataata ataccgtttt gtatttacgt tccatttaaa cacctcataa 

15191 aaaatagggg ataagtatcc cctatgaaat tgtattaaaa tgatacttga ccaaaattga ttgagtaacc 

15261 tttttgacct tttttgtttt catattcata aattgtgaat tgaacttctc cagcattgat aatgtcaaca 

15331 acgtcctcat ctgctctcat ttctttaatt aattctgtta agtggttcgg taagtttacg ttatagtcat 
15401 cagtgacgat aacaccttgt tcaccgaatt ttgattcttt gtttgtgaat aatgctctaa cgatatactc

15471 ttttttcata ccgtattttt ctactaattc tgatagtttg ataaattctc tttctttttc ctcaaattca

15541 aatctcgcta atgtgttttg gtgtcttgat aaaatatctt ttacgtttgt cattttattt ctcctcttat

15611 ttaaattatt tgctttctgc aattgcgatt tgtagtaaat cattgtaata aacttgaatt gttttcgttg

15681 tgcgtgtagt ggacaatagt ttacatgtgt ctggtaataa ttcttttgct tgtgttttgg ttaaatgata

15751 ctcgtgaagt ggtaaaaatt cctcaatgta ttcattatca tcatctaagt aatgaagtat ataacctttg

15821 acacgtaagg taacaatgtc gtcaactttc attattatat cactcctttc taaaaaacgt aaacgttata

15891 cgtttcataa aatcctttat gcatattcca ttgttctatt gggtcatcac cagcaatata agacaatatt

15961 gattctggtt tagtttcgtt gtttagttca tcatttaaga attgaacaac agaactatta tagtttaata

16031 atagttgttg gcaagccgat aataagttaa ttgcattgtc aaatgtataa gctggattcc attgaatcag

16101 tttattgaat agttgcaaca tttcagtata ggcttgtcct ttttcttctg gtgcattatc aacattaacc

16171 attattatca cttcctaata aagttgaaat tacgcgtaaa acagaattat gatttaaatc ttcaatttca

16241 tcaatgtcaa catcataaaa tgaaatttca ttttctgttc tatcaaataa cgctatacat aaacttccat

16311 tcttaaaacg aaaaacatgc ttcaactcaa tgttttttgt ttcattttcc atttttgtta ctccttgttt

16381 tgattacata cttagtatag caaacgttta aaagttttgt caatagtttt tcttaaaaaa gtttaaataa

16451 ttttaaaact actatttaat agaagaaata agattttaag ttcaaatcat aattttgaat aaaagtcaat

16521 agatacataa attttgtatt tgatgaatat gtaataggtt agacaagttg gttaagttgt tgcacagtat

16591 ttttaagttt agtaaagaaa tgataagtaa atttataagt tttgatttgt ataatcgttt attttaaacc

16661 ggtggggt
Table 17



Phage 44AHJD ORFs list 



nb   ORF            Frame   Position        Size (a.a.)   Key words
1    44AHJDORF001   -1      10342..12627    761           DNA polymerase;
2    44AHJDORF002   3       3789..5732      647           Teichoic acid; Staph;
3    44AHJDORF003   2       6626..8389      587           Tail;


4 


44AHJDORF004


1


8764..10227


487 


S^nnft nrntP3*»o mftbr 


5 


44 am innocftftc 


.-| 


12643..13890


415 




g 


if au innRFOOR 


2 


803..2029


408 




7 


44AM inORpnOT 


4 

i 


9044 3097 


327 


I innnr onflar 


a 
o 


iM AU inODCftftQ 
•rfnrw UU r\ rvUO 


4 


3090 *?77R 


251 


1 ftwpr cftjlar* 

Lwrvvl \AJiKll t 


Q 
9 


44AM innppnnQ 


9 
4 


R744 R4QC 
Of *rfnDHW 


250 


Amirt^QA' 5>tanh* 


in 


44am innRPmn 


-9 
•4 


1^03 A 14490 
I0900.. iWtw 


160 




44 

1 1 


44MnJUvr\rUl 4 


o 
O 


A1Q1 AA10 
DO 3 1 ..OO fO 


140 


HoJin* 


19 


44am tnnppmo 
44Mnouwrtrui o 


-4 


14«>Aft 14QQA 
I4oOD.. H99u 


13R 
I OO 




10 
10 


44AM tnODC1 1 0 


4 

1 


4QQ fiftft 


133 

I OO 




4il 

14 


44am innocm 1 


-4 


1 *%99*\ 1 *i(sQ0 
1 0440. . 1 0090 


199 

1 44 




4C 


i i AU inftDCi 4 i( 

44/\rUUvr\r1 14 


o 

-4 


100/ U..101 /4 


100 




4fi 


i i am inADcn 4 a 
44AnJDORP0 1 4 


3 


0240.. 004 1 


Q9 
94 




17 


44AHJDORr01 5 


1 


4CJAA iecic 

10403.. 10O40 


^ Oft 

oU 




18 


a a au inAnrnic 

44/vtJDORFOI 0 


-1 


4CC4C 4 COCO 

Ioo1o..iooo2 


7Q 

to 




4ft 

19 


J J AN ir\/^o eft 4 "7 


! -2 


4 ftCOC 4 ft7 117 

iuooo..i0f Of 


70 
( 0 




*lft 


44AHJDUKr01 o 


-1 


ODC 4Aftfi 


7ft 




21 


44AHJDORF01 9 


-2 


9oo0..9ooo 


bo 




22 


44ArWDORr 121 


-1 


4C4CC 4 COCO 

101 DO.. 10002 


00 




Z3 


44AnjOORr0Z0 


2 


4 0QCC 4/1 ft CO 

10O0O..14000 


04 




Oil 

44 


y4 A i| mftDC40'S 

44/\nJ UL/Kr 1 40 


2 


£4 if 7QC 

014. .raO 


RO 
OU 




9C 

40 


44 ArwLHJK rU2 1 


-4 


0004. .OO 10 


RO 
OU 




oc 


44ArUl_XJKrU40 


-4 


CO 4 C ttAQA 

0010..0494 


RQ 

09 




4f 


44 AMJ LXJRrOZ4 


4 

1 


1 44 f 0.. 14401 


RA 
OO 




OQ 
40 


A A Al_l IHADCAIC 

44AnJUUKrQ20 


•3 


4 it Oft ft 4C4 7C 

14999.. 101 1 0 


RA 
00 




Oft 


44AHJUORr02D 


-3 


4ilitOfi 4 it COO 

14440.. 14090 


RR 
00 




Oft 

OU 


it it AU !P\ADCA07 

44ArwLKJr<rU27 


4 


14910..10U0U 


R4 
04 




04 
01 


44/\nJUUKr029 


A 
-1 


1CA.1Q 1C1A0 
10U19..10100 


R4 




oo 


>i it a u inrtDcnoo 
44/VlJUwKrU40 


-0 


Qft71 Q90C 
Wf 1..940O 


R4 




oo 
00 


i i au trv%DCftOft 
44ArwUv/KrU0U 


o 
o 


AAAQ.7 14A4A 
1440f ..14040 


CO 
OO 




Oil 

04 


ifilAkJ irV^kDCft04 

44AT1 J UUKr U0 1 


9 
4 


44AOO 11101 
1 1U09..1 1 iyi 


R0 

OU 




OR 


44MnJuvJKP 1 0O 


0 


CQO RAO 
090..044 


4Q 




OR 

oo 


44AM jrWIPPAOO 


4 
- 1 


OA4A 07QC 


49 




07 
Of 


44 am innppno9 


9 
-4 


ft QftR 04CC 


49 




OA 
00 


44MT»JLWJr\rU04 


o 

-0 


14AAA 1414ft 
14UUU..14140 


4ft 




OQ 
09 


44AM irVtPPAOR 
44HTWU\jr\rU0O 


o 
-o 


4 004 4 40QC7 
IOO 1 1 .. 1 090/ 


4fi 




4U 


44 am mnocmfi 


o 
-o 


Ivv 19.. Ill 1 DO 


48 




41 


44 AM inoppn99 
44/\njuwr\r U44 


o 

•0 


A4AA AR11 
0400..001 1 


47 




49 
«r4 


44am mnppno.7 

44MTI OU UrtrUO i 


4 
1 


14 7ft A 14Q31 


47 




43 

*»o 


44AM irVlRPAOA 
44MnOUwf%rVOO 


9 
-4 


0040..0V' 1 


47 I 




44 
44 


44AM mORPAOO 
44MnJL/UrtrU09 


o 
0 


1743 1AA3 

1 f HO.. IOOO 


46 




40 1 


44AM irV)PPA4A i 
44MnJU«Jr\TU4U 


4 


074ft QA77 


4R 

**o 




4R 
40 


44AM irVlPPA41 
44MrwLWJf\PV4 1 


9 
4 


10000..109/ 0 


4R 

«tO 




4f 


44AM irtf"lPPA49 


4 

-1 


OU I4..0 IO ) 


45 




48   44AHJDORF043   -1   4402..4539     45
49   44AHJDORF044   -2   12783..12917   44
50   44AHJDORF149   -2   639..770       43
51   44AHJDORF046   1    4891..5019     42
52   44AHJDORF047   1    11911..12039   42
53   44AHJDORF045   2    10655..10783   42
54   44AHJDORF048   -3   15212..15340   42
55   44AHJDORF049   3    5784..5909     41
56   44AHJDORF050   3    13158..13283   41
57   44AHJDORF051   -2   10944..11066   40
58   44AHJDORF052   -3   14216..14338   40
59   44AHJDORF053   3    3348..3467     39
60   44AHJDORF054   3    7551..7670     39
61   44AHJDORF055   3    15705..15821   38
62   44AHJDORF056   1    5512..5625     37
63   44AHJDORF057   2    10121..10231   36
64   44AHJDORF058   3    10767..10877   36
65   44AHJDORF164   -1   592..702       36
66   44AHJDORF059   -2   8250..8360     36
67   44AHJDORF060   -2   6147..6257     36
68   44AHJDORF061   2    15551..15658   35
69   44AHJDORF062   1    4285..4389     34
70   44AHJDORF063   -3   9381..9487     34
71   44AHJDORF065   1    5029..5130     33
72   44AHJDORF064   2    2609..2710     33
73   44AHJDORF066   -2   10380..10481   33
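The Size column of Table 17 appears to follow from the Position column: in the rows whose entries are legible, the length of the span divided by three, less one for the stop codon, gives the listed number of amino acids. The following Python sketch checks that relationship. The function name `orf_size_aa` is hypothetical, and the assumption that each position range includes the stop codon is inferred from the legible rows, not stated in the table.

```python
def orf_size_aa(start: int, end: int) -> int:
    """Residue count for an ORF spanning start..end (1-based, inclusive).

    Assumption: the span includes the terminal stop codon, as the
    legible rows of Table 17 suggest.
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length_nt // 3 - 1  # subtract the stop codon


# Rows 1, 2 and 48 of Table 17 as (start, end, listed size).
rows = [(10342, 12627, 761), (3789, 5732, 647), (4402, 4539, 45)]
for start, end, listed in rows:
    assert orf_size_aa(start, end) == listed
```

A span whose length is not a multiple of three raises an error, which can help flag position entries damaged in transcription.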
Table 18

Predicted amino acid sequences 

44AHJDORF001 

12627 atgggattactagaatgcatgcaatatcataaacatgaacgtcgaatgattttatactgggatatagaaacattagcgtacaat

1 MGLLECMQYHKHERRMILYWDIETLAYN

12543 aaagttaacggacgaaaaaaaccaaccaaatataaaaacgttacttattctgtagcaattggttggtttaatggttatgaaatt

29 KVNGRKKPTKYKNVTYSVAIGWFNGYEI

12459 gatgttgaagtatttccgagtttcgaatctttttatgacgcattttatacgtatgtgaaaagacgtgatacaatcacaaaatca

57 DVEVFPSFESFYDAFYTYVKRRDTITKS

12375 aaaacagatattatcatgattgcacataactgtaataaatacgataatcattttttacttaaagacaccatgcgttattttgat

85 KTDIIMIAHNCNKYDNHFLLKDTMRYFD

12291 aatattacacgcgaaaatatatatttaaaatctgcagaagaaaatgaacacacattaaaaatgaaagaggctactattttagcc 

113 NITRENIYLKSAEENEHTLKMKEATILA

12207 aaaaatcaaaatgtaattttagaaaaacgtgttaaatcttcaatcaatttagatttaacaatgtttttaaatggttttaaattt 

141 KNQNVILEKRVKSSINLDLTMFLNGFKF

12123 aatattactgataactttatgaaaaccaatacatcaattgcaacattaggtaagaaattacttgatggtggttatttaacagaa 

169 NIIDNFMKTNTSIATLGKKLLDGGYLTE

12039 tcacaacttaaaacagattttaattatacgatttttgataaagataatgatatgaatgatagtgaagcctatgactatgctgtg 

197 SQLKTDFNYTIFDKDNDMNDSEAYDYAV

11955 aaatgttttgcaaaactcacacctgaacaacttacatacattcataatgacgtgattatattaggtatgtgccatattcattat

225 KCFAKLTPEQLTYIHNDVIILGMCHIHY

11871 agtgatatatttccaaattttgactataacaaattaacattttcattgaatattatggaatcttacttgaataatgaaatgaca 

253 SDIFPNFDYNKLTFSLNIMESYLNNEMT

11787 cgttttcagttactcaaccaatatcaagatattaaaatatcttatacacattatcatttccatgatatgaatttttatgactat

281 RFQLLNQYQDIKISYTHYHFHDMNFYDY

11703 attaaatcattctatcgtggtggtttaaatatgtataacaccaaatacataaacaaactaattgatgagccttgtttttctatt 

309 IKSFYRGGLNMYNTKYINKLIDEPCFSI 

11619 9acaccaattcg^ 

11535 acattaatccctacttttttagatgatgacaattatttttcattatataagattgataaagatgtatttaacgatgatttatta 

365 t liptFLDDDNYFSLYKIDKDVFNDDLL 

11451 attaaaattaaatcacgtgtattacgtcaaatgattgtaaaatactataataatgataatgattacgttaatatcaatacaaat 

393 I K IKSR VLRQMIVKYYNNDNDYVNINTN 

11367 acattaagaatgattcaagacattacgggtatcgattgcatgcatatacgtgttaattcgtttgttatatatgaatgtgaatac 

421 TLRMIQDITGIDCMHIRVNSFVIYECEY

11283 tttcacgcacgtgatattatttttcaaaactattttattaaaacacaaggtaagttaaaaaacaaaatcaatatgacatcacct 

449 FHARDIIFQNYFIKTQGKLKNKINMTSP

11199 tacgactatcacattactgatgatatcaacgaacacccatactcaaatgaggaggttatgttatctaaagtcgttttaaatgga 

477 YDYHITDDINEHPYSNEEVMLSKVVLNG

11115 ttatacggcatacctgcattacgttcacattttaacttattccgtttagatgataacaatgaactatacaatatcattaacggt 

505 LYGIPALRSHFNLFRLDDNNELYNIING

11031 tacaaaaacactgaacgtaatatattattctctacatttgtcacatcacgttcattgtataacttattggttcctttccaatac 

533 YKNTERNILFSTFVTSRSLYNLLVPFQY 

10947 tcaacggaaagtgaaattgacgacaattttatttattgcgatactgatagtttgtatatgaaatccgttgttaaacccttattg 

561 LTESEIDDNFIYCDTDSLYMKSVVKPLL 

10863 aaccccagtttattcgacccgatagccttaggtaaatgggatattgaaaacgaacagatagataagatgtttgtactgaatcat 

589 NPSLFDPIALGKWDIENEQIDKMFVLNH 

10779 aagaaatatgcatatgaagtgaatggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgat 

617 KKYAYEVNGKIKIASAGIPKNAFDTSVD

10695 tttgaaacctttgtacgtgaacaattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaata 

645 fItfvreqffdgaiienhksiyneqgti 

10611 tcgatatatccgtctaaaactgaaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatgaaacgtgaa 

673 SIYPSKTEIVCGNVYDEYFTDELNMKRE

10527 tttatattaaaagacgctagagaaaatttcgaccatagtcaatttgatgatattctttatattgaaagtgacatcggttcattt 

701 FILKDARENFDHSQFDDILYIESDIGSF 

10443 tcacttaacgacttatttccagttgaacgttcagtacataacaaatctgatttgcatatattaaaacgtgaacatgatgaaata 

729 SLNDLFPVERSVHNKSDLHILKREHDEI 

10359 aaaaaaggcaactgttaa 10342 

757 K K G N C * 
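Each block of Table 18 pairs an 84-nucleotide line with the 28 residues it encodes, and the coordinates of 44AHJDORF001 run downward from 12627 to 10342 because this ORF lies on the reverse strand. A minimal Python sketch of the two operations involved (reverse complement and codon translation) is given below. It assumes the standard genetic code; the helper names are hypothetical; and the forward-strand text for positions 10342..10359 is read from the genome listing with the apparent OCR slip "gec" taken as "gcc", which is also an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (assumed here).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase a/c/g/t string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def translate(seq: str) -> str:
    """Translate whole codons from position 0; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Forward-strand text for 10342..10359 (OCR-corrected, see the note above).
forward = "ttaacagttgcctttttt"
coding = reverse_complement(forward)
print(coding)             # aaaaaaggcaactgttaa
print(translate(coding))  # KKGNC*
```

The printed values match the final nucleotide and residue lines of 44AHJDORF001 above. Any character outside a, c, g, t raises a `KeyError` in `translate`, which makes transcription damage in a nucleotide line easy to spot.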

44AHJDORF002

3789 atggcatataatgaaaacgattttaaatatctcgatgacattcgtccatttttagacgaaatttataaaacgagagaacgttat 

1 MAYNENDFKYFDDIRPFLDEIYKTRERY

3873 acaccgttttacgatgatagagcagatcataatactaattcaaaatcatattatgattatatttcaagattatcaaaactaa^t 

29 TPFYDDRADYNTNSKSYYDYISRLSKLI

3957 gaagtattagcacgtcgtatttgggactatgacaatgaattaaaaaaacgtttcaaaaattgggacgacttaatgaaagcattt

57 EVLARRIWDYDNELKKRFKNWDDLMKAF

4041 ccagagcaagcgaaagacttatttagaggttggttaaacgacggtacgattgacagcattattcatgacgagtttaaaaaatat 

85 PEQAKDLFRGWLNDGTIDSIIHDEFKKY

4125 agcgcaggattaacatcggcatctgctttatttaaagttactgaaatgaaacaaatgaatgactttaaatcagaagttaaagac

113 SAGLTSAFALFKVTEMKQMNDFKSEVKD

4209 ctaactaaagatattgaccgtttcgttaatgggtttgaatcaaatgagcttgaaccaaagtttgtgatgggctttggtggtatt 
141 LIKDIDRFVNGFELNELEPKFVMGFGGI

4293 cgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaaaacctgaa 

169 RNAVNQSINIDKETNHMYSTQSDSQKPE 

4377 ggtttttggataaataaattaacacctagtggcgacttaatttcaagcatgcgtattgtacagggtggtcatggtacaacaatc 

197 GFWINKLTPSGDLISSMRIVQGGHGTTI 

4461 ggattagaacgtcaatccaatggtgaaatgaaaatctggttacatcacgatggtgttgcaaaactgttacaagtcgcatataaa 

225 GLERQSNGEMKIWLHHDGVAKLLQVAYK

4545 gataattatgtattagatttagaagaggctaaaggtttaacagattatacaccacagtcacttttaaacaaacacacatttaca 

253 DNYVLDLEEAKGLTDYTPQSLLNKHTFT

4629 ccgttaattgatgaagcaaatgacaaactcatcttaagattcggtgacggaacaatacaggttcgttcaagagcagacgtaaaa

281 PLIDEANDKLILRFGDGTIQVRSRADVK 

4713 aatcacattgataatgtagaaaaagaaatgacaattgataattcagaaaacaatgataatcgttggatgcaaggcactgctgtt 

309 NHIDNVEKEMTIDNSENNDNRWMQGIAV

4797 gatggtgatgatttatactggttaagtggtaacagttcagttaattcacatgttcaaatcggtaaatattcattaacaacaggt 

337 DGDDLYWLSGNSSVNSHVQIGKYSLTTG 

4881 caaaagatttatgattatccacttaagttatcatatcaagacggtattaatttcccacgtgataactttaaagagcctgagggt 

365 QKIYDYPFKLSYQDGINFPRDNFKEPEG

4965 atttgcatttatacaaatccaaaaacaaaacgtaaatcgttattacttgctatgacaaacggcggtggtggaaaacgtttccat 

393 ICIYTNPKTKRKSLLLAMTNGGGGKRFH

5049 aatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggttcacaaaactataaattaaca 

421 NLYGFFQLGEYEHFEALRARGSQNYKLT 

5133 aaagacgacggtcgtgcattatctattccagaccatatcgacgatttaaatgacttaacgcaagctggtttttattatattgac 

449 KDDGRALSIPDHIDDLNDLTQAGFYYID

5217 gggggtactgcagaaaaacttaagaatatgccaatgaatggtagcaagcgtataattgacgctggttgtttcattaatgtatac 

477 GGTAEKLKNMPMNGSKRIIDAGCFINVY 

5301 cctacaacacaaacattaggtacggttcaagaactaacacgtttctcaacaggtcgtaaaatggttaaaatggtgcgtggtatg 

505 PTTQTLGTVQELTRFSTGRKMVKMVRGM 

5385 actttagacgtatttacgttaaaatgggattatggattatggacaacaatcaaaactgacgcaccatatcaagaatatttggaa 

533 TLDVFTLKWDYGLWTTIKTDAPYQEYLE

5469 gcaagtcaatacaataactggattgcttatgtaacaacagctggtgagtattacattacaggtaaccaaatggaattatttaga 

561 ASQYNNWIAYVTTAGEYYITGNQMELFR 

5553 gacgcgccagaagaaattaaaaaagtgggtgcatggttacgtgtgtcaagtggtaacgcagtcggtgaagtaagacaaacatta 

589 DAPEEIKKVGAWLRVSSGNAVGEVRQTL

5637 gaggctaatatatcggaatataaagaattcttcagtaatgttaatgcggaaacaaaacatcgtgaatatggttgggtagcaaaa 

617 EANISEYKEFFSNVNAETKHREYGWVAK

5721 catcaaaaatag 5732 

645 H Q K * 
44AHJDORF003 

6626 atgagaaagtcaacgaattttaagtttttctataacacaccgtttacagactatcaaaacacgattcattctaatagtaataaa 

1 MRKLTNFKFFYNTPFTDYQNTIHFNSNK

6710 gaacgtgatgattattttttaaatggtcgtcattttaaatcgttagactattcaaaacaaccgtataattttatacgtgataga 

29 ERDDYFLNGRHFKSLDYSKQPYNFIRDR 

6794 atggaaatcaatgttgatatgcagtggcatgacgcacaaggtattaactacatgacgtttttatcagattttgaggatagaaga 

57 MEINVDMQWHDAQGINYMTFLSDFEDRR

6878 tattacgcttttgtaaaccaaatcgaatacgtgaatgacgttgtggttaaaatatattttgtcattgataccattatgacgtat 

85 YYAFVNQIEYVNDVVVKIYFVIDTIMTY

6962 acacaagggaatgtattagagcaactctcaaacgtcaatattgaacgtcaacatttatcaaaacgcacgtataactatatgtta 

113 TQGNVLEQLSNVNIERQHLSKRTYNYML

7046 ccaatgttacgtaataatgatgatgtgttaaaagtatcaaataaaaactatgtttataaccaaatgcaacaatatttggaaaat 

141 PMLRNNDDVLKVSNKNYVYNQMQQYLEN 

7130 ttagtattattccagtcaagcgctgatttatcaaagaaatttggtactaaaaaagagccaaacttagatacgtcaaaaggtacg 

169 LVLFQSSADLSKKFGTKKEPNLDTSKGT

7214 atttatgacaatatcacatcaccagtcaacttatacgttatggaatatggtgactttattaaccttatggataaaatgagtgcc 

197 IYDNITSPVNLYVMEYGDFINFMDKMSA 

7298 tatccatggattacgcaaaactttcaaaaggttcaaatgttacctaaagactttattaatacaaaagacttagaggacgttaaa 

225 YPWITQNFQKVQMLPKDFINTKDLEDVK

7382 accagtgaaaaaattacaggattaaaaacattaaaacagggtggtaaatcaaaagaatggagtctaaaagatttatcattaagt 

253 TSEKITGLKTLKQGGKSKEWSLKDLSLS

7466 ttctcaaatcttcaagagatgatgttatctaaaaaagatgaatttaaacatatgatacgtaatgagtatatgacaattgaattt 

281 FSNLQEMMLSKKDEFKHMIRNEYMTIEF 

7550 tatgactggaatggaaatacgatgttactcgacgctggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatt 

309 YDWNGNTMLLDAGKISQKTGVKLRTKSI

7634 attggttatcataatgaagttcgagtatatccagtagattataacagtgctgaaaacgacagaccaatactcgctaaaaataaa 

337 IGYHNEVRVYPVDYNSAENDRPILAKNK

7718 gaaatattgattgatacgggttcattcttaaatacaaatataacatttaatagttttgcacaagtaccaatattaatcaataat

365 EILIDTGSFLNTNITFNSFAQVPILINN

7802 ggtatcttaggacaatcacaacaagccaaccgacaaaaaaatgcagaaagtcaattaattacaaatcgtattgataatgtatta 

393 GILGQSQQANRQKNAESQLITNRIDNVL

7886 aatggtagcgacccgaaatcacgcttttatgacgctgtgagtgtagcaagtaatttaagtccaactgctttatttggtaagttt

421 NGSDPKSRFYDAVSVASNLSPTALFGKF

7970 aatgaagaatataatttctacaaacaacaacaagctgaatataaagatttagccttacaaccaccttctgtaactgaatcagaa 

449 NEEYNFYKQQQAEYKDLALQPPSVTESE 

8054 atgggcaacgcattccaaactgcgaatagcattaacggtttaacgatgaaaattagtgtaccgtcacctaaagaaattacattt 

477 MGNAFQIANSINGLTMKISVPSPKEITF

8138 ctacaaaaatatcatatgttgtttggttttgaagtgaatgactataattcatttattgaaccaattaacagtatgaccgtttgc 
505 LQKYYMLFGFEVNDYNSFIEPINSMTVC

8222 aattatttaaaatgtacaggtacgtatactatacgtgacatcgaccccatgttaatggaacaattaaaagcaattttagaatct 

533 NYLKCTGTYTIRDIDPMLMEQLKAILES

8306 ggtgtaagattttggcataatgacggttcaggtaatccaatgttacaaaatccattaaataacaaattcagagagggggtataa 

8389

561 GVRFWHNDGSGNPMLQNPLNNKFREGV* 
44AHJDORF004

8764 atgatactgaaaagagtgataacaatgaacgatcaagagaagatagataaatttacgcattcctatattaatgatgattttggt 

1 MILKRVITMNDQEKIDKFTHSYINDDFG

8848 ttaacgatagaccagttagtccctaaagtaaaaggatatgggcgctttaatgtatggcttggtggtaatgaaagtaaaatcaga

29 LTIDQLVPKVKGYGRFNVWLGGNESKIR

8932 caagtattaaaagcagtaaaagagataggtgtttcacctactctttttgccgtatatgaaaaaaatgagggttttagttctgga 

57 QVLKAVKEIGVSPTLFAVYEKNEGFSSG

9016 cttggttggttaaaccatacgtctgcacgtggtgattatttaacagatgctaaattcatagcaagaaagttagtatcacaatca 

85 LGWLNHTSARGDYLTDAKFIARKLVSQS 

9100 aaacaagctggacaaccgtcttggtatgacgcaggtaacatcgcccactttgtaccacaagacgtacaaagaaaaggtaatgca 

113 KQAGQPSWYDAGNIVHFVPQDVQRKGNA

9184 gattttgcaaaaaatatgaaagcaggtacaattggacgtgcatatattccattaacagcagctgctacttgggcggcatattat 

141 DFAKNMKAGTIGRAYIPLTAAATWAAYY

9268 cctttaggtttgaaagcatcatataacaaagtacaaaactatggtaatccatttttagacggcgcgaatactattctagcttgg 

169 PLGLKASYNKVQNYGNPFLDGANTILAW

9352 ggtggtaaattagacggtaaaggtggatcacctagtgattcgtctgacagtggtagtagtggtgacagtggtagttcactactc

197 GGKLDGKGGSPSDSSDSGSSGDSGSSLL

9436 gctttagcaaaacaagccatgcaagaattattaaaaaaaatacaagacgcattacaatgggacgttcatagtattggtagtgat 

225 ALAKQAMQELLKKIQDALQWDVHSIGSD

9520 aaattttttagtaatgattattttacattagaaaaaacatttaacaacacatatcatattaaaacgacgattggtttacttgat 

253 KFFSNDYFTLEKTFNNTYHIKMTIGLLD 

9604 tcattaaaaaaactgattgatagcgttcaagtagatagtgggagtagtagttctaatcctactgacgatgacggagaccataaa 

281 SLKKLIDSVQVDSGSSSSNPTDDDGDHK

9688 ccaattagtggtaaatcagtcaagccaaatggaaaaagtggtcgtgtgattggtggtaactggacatatgcacagttaccagaa 

309 PISGKSVKPNGKSGRVIGGNWTYAQLPE

9772 aaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttatacaaaccaggtaacatatttcctcaaacgggtaat 

337 KYKKAIGVPLFKKEYLYKPGNIFPQTGN 

9856 gcaggacaatgtacagaattaacatgggcgtatatgtcacaactacatggtaaaagacaacctaccgacgacggtcaaataaca 

365 AGQCTELTWAYMSQLHGKRQPTDDGQIT 

9940 aacggtcagcgtgtatggtacgtctataaaaagttaggtgcaaaaacaacacataatccaacagtaggttatggtttctctagt 

393 NGQRVWYVYKKLGAKTTHNPTVGYGFSS

10024 aaaccaccatacttacaagcaactgcatatggtattggtcacacaggtgttgttgtagcagtttttgaagatggttcgttttta 

421 KPPYLQATAYGIGHTGVVVAVFEDGSFL 

10108 gttgcaaactataatgtaccaccatatgttgcaccatcacgtgtggtattgtatacactcattaatggcgtaccaaataatgct 

449 VANYNVPPYVAPSRVVLYTLINGVPNNA 

10192 ggtgataatattgtattctttagtggtattgcttaa 10227 

477 GDNIVFFSGIA* 
44AHJDORF005

13890 atggtaaaacaaaatcgtttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataagtat 

1 MVKQNRLDMVRDYQNAVNHVRKKIPDKY 

13806 aatcaaatagaattagttgatgaacttatgaatgatgatatagattattatatatctatttcaaaccgttctgatggaaaatcg 

29 KQIELVDELMNDDIDYYISISNRSDGKS 

13722 ttcaactatgtttcattttttatttatctagctattaaacttgatataaaatttactttattatcacgtcattatacattacgt 

57 FNYVSFFIYLAIKLDI KFTLLSRHYTLR 

13638 gacgcttaccgtgattttattgaagaaatcatagatgaaaatccactatttaaatcaaaacgtgtcacgttcagaagtgctagg

85 DAYRDFIEEIIDENPLFKSKRVTFRSAR

13554 gactatttagctattatctatcaagataaagaaattggtgtgattacagatttgaatagtgccactgatttaaaatatcattct 

113 DYLAIIYQDKEIGVITDLNSATDLKYHS 

13470 aactttttaaaacactatcctattattatatatgatgagtttttagcacttgaagatgattatttaattgatgagtgggataag 

141 NFLKHYPIIIYDEFLALEDDYLIDEWDK

13386 ttaaaaacaatatatgaatcaatcgaccgtaaccatggtaacgttgattatattggattccctaaaatgtttttactaggtaat 

169 LKTIYESIDRNHGNVDYIGFPKMFLLGN

13302 gcagtcaacttttcaagtcctatattatccaatttaaatatatacaatttattacaaaagcataaaatgaatacatcaagactt

197 AVNFSSPILSNLNIYNLLQKHKMNTSRL

13218 tacaaaaacatttttttagaaatgcgacgaaacgattacgtgaatgaaaaacgtaacacacgtgcgtttaattcaaatgacgac 

225 YKNIFLEMRRNDYVNEKRNTRAFNSNDD

13134 gctatgacaactggagaatttgaatttaacgaatataatttggcggatgataatttaagaaatcacatcaatcaaaacggtgat

253 AMTTGEFEFNEYNLADDNLRNHINQNGD 

13050 ttcttctacatcaaaactgatgataaatatattaaagtcatgtataatgtaactacttttatgacaaatattatcgttgtacca 

281 FFYIKTDDKYIKVMYNVTTFMTNIIVVP

12966 tatacaaaacaatatgaattttgtactaaaattagggatatagacaatcatgttacctatttacgtgatgatatgttttataaa 

309 YTKQYEFCTKIRDIDNHVTYLRDDMFYK

12882 gaaaacatggaacgttattactacaatccaagcaatttacattttgacaatgcttactctaaaaattacgtggttgataatgat

337 ENMERYYYNPSNLHFDNAYSKNYVVDND

12798 agatatttatatttagatatgaataaaattataaaatttcatataaaaaatgaaatgaagaaaaatatgagtgagtttgaaaga

365 RYLYLDMNKIIKFHIKNEMKKNMSEFER

12714 aaagaaaaaatatacgaagataactatatagagaatacgaaaaagtatctaatgaaacaatatggcttataa 12643 

393 KEKIYEDNYIENTKKYLMKQYGL*
44AHJDORF006 
803 atggcacaacaatctacaaaaaatgaaactgcacttttagtagcaaagtcagctaaaccagcgttacaagattttaatcatgat

1 MAQQSTKNETALLVAKSAKSALQDFNHD 

887 tattcaaaatcttggacatttggcgacaaatgggataattcaaatacaatgttcgaaacatttgtaaataaatatttattccct 

29 YSKSWTFGDKWDNSNTMFET FVNKYLFP 

971 aagattaatgagactttattaatcgatattgcattaggtaatcgttttaactggttagctaaagagcaagattttattggacaa 

57 KINETLLIDIALGNRFNWLAKEQDFIGQ 

1055 tatagtgaagaatacgtgattatggacacagtaccaattaacatggacttatctaaaaatgaggaattaatgttgaaacgtaat 

85 YSEEYVIMDTVPINMDLSKNEE L M L K R N 

1139 tatccacgtatggcaactaagttatatggtaacggaattgtgaagaaacaaaaattcacattaaacaacaatgatacacgtttc 

113 YPRMATKLYGNGIVKKQKFTLNNNDTRF 

1223 aatttccaaacattagcagacgcaactaactacgctttaggtgtatacaaaaagaaaatttctgatattaatgtattagaagaa 

141 NFQTLADATNYALGVYKKKISDINVLEE

1307 aaagaaatgcgtgcaatgttagttgattactcattgaatcaattatccgaaacaaatgtacgtaaagcaacatcaaaagaagat 

169 KEMRAMLVDYSLNQLSETNVRKATSKED 

1391 ttagcaagcaaagtttttgaagcaatcctaaacttacaaaacaacagtgctaaatataatgaagtacaccgtgcatcaggtggt 

197 LAS KVF EAI LNLQNNSAKYNEVH RASGG 

1475 gcaattggacaatatacaactgtatcaaaattaaaagatattgtgattttaacaacagattcattaaaatcttatcttttagat 

225 AIGQYTTVSKLKDIVILTTDSLKSYLLD 

1559 actaagattgcaaacacattccagattgcaggcattgatttcacagatcacgttattagttttgacgacttaggtggcgtgttt 

253 TKIANTFQIAGIDFTDHVISFDDLGGVF 

1643 aaagtaacaaaagaatttaagttacaaaaccaagattcaattgactttttacgtgcgtatggagattatcaatcacaattagga 

281 KVTKEFKLQNQDSIDFLRAYGDYQSQLG 

1727 gatacaattccagttggtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaacca 

309 DTI PVGAVFTYDVSKLKEFTGNVEEIKP 

1811 aaatcagatttatatgcgtttattttggatattaattcaattaaatataaacgttacacaaaaggtatgttaaaaccaccattc 

337 KSDLYAFILDINSIKYKRYTKGMLKPPF 

1895 cataaccctgaatttgatgaagttacacactggattcattactattcatttaaagccattagtccattctttaataaaatttta 

365 HNPEFDEVTHWIHYYSFKAISPFFNKIL

1979 attactgaccaagatgtaaatccaaaaccagaggaagaattacaagaataa 2029 

393 ITDQDVNPKPEEELQE* 
44AHJDORF007 

2044 atgaacaacgataaaagaggtttaaacgttgagttatcaaaggaaatcagcaaaagagttgttgaacatcgcaacagatttaaa 

1 MNNDKRGLNVELSKEISKRVVEHRNRFK

2128 cgtcttatgtttaatcgttatttggaatttttaccgctactaatcaactataccaatcgtgatacggttggtatagattttatt 

29 RLMFNRYLEFLPLLINYTNRDTVGIDFI 

2212 cagttagaatcagctttaagacaaaacattaatgtagttgttggtgaagctagaaataagcaaattacgattcttggttatgta 

57 QLESALRQNINVVVGEARNKQI MILGYV 

2296 aataacacttactttaatcaagcaccaaatttttcatcaaactttaatttccaatttcaaaaacgactaactaaagaagatata 

85 NNTYFNQAPNFSSNFNFQFQKRLTKEDI 

2380 tattttattgtacctgactatttaatacctgatgattgtctacaaattcataagctatatgataactgtatgagtggtaacttt 

113 YFIVPDYLIPDDCLQIHKLYDNCMSGNF

2464 gttgtcatgcaaaataaaccaattcaatacaatagtgatatagaaattatagaacattatactgatgaattagcagaagttgct 

141 VVMQNKPIQYNSDIEIIEHYTDELAEVA

2548 ttatctcgcttttctttaatcatgcaagcaaaatttagcaagatatttaaatcagaaattaatgacgagtcaatcaatcaactt 

169 LSRFSLIMQAKFSKIFKSEINDESINQL

2632 gtgtccgaaatatataacggtgcaccatttgttaaaatgtcacctatgtttaatgcagatgacgatatcattgatttaacaagt 

197 VSEIYNGAPFVKMSPMFNADDDI IDLTS 

2716 aatagcgtaatcccagcattaactgaaatgaaacgggaatatcaaaacaaaattagtgaattaagtaactatttaggcattaat 

225 NSVIPALTEMKREYQNKISELSNYLGIN 

2800 tcattagccgttgataaagaaagcggtgtttcagacgaagaggcaaaaagtaatcgtggatttaccacatcaaacagtaatatc 

253 SLAVDKESGVSDEEAKSNRGFTTSNSNI 

2884 tatttaaaaggtcgtgaaccaattacgtttttatcaaagcgttatggtttagatattaaaccgtattacgatgatgaaacaacg 

281 YLKGREPITF LSKRYGLDI KPYYDDETT 

2968 tctaaaatatcaatggtagacacactttttaaagatgaaagcagtgatataaatggctag 3027 

309 SKI SMVDTLFKDESSDING* 
44AHJDORF008 

3020 atggctagatacacaatgactttatacgatttcattaaatcagaattgattaaaaaaggtttcaatgaatttgtaaatgataat 

1 MARYTMTLYDFIKSELIKKGFNEFVNDN 

3104 aaatcaacgttttatgatgatgaatttcaattcatgcaaaaaatgctgaagttcgacaaagacgttttagctatcgttaatgaa 

29 KLTFYDDEFQFMQKMLKFDKDVLAIVNE 

3188 aaagtatttaaaggtttttcattgaaagatgaattatcagatttactttttaaaaaatcatttacgattcattttttagataga 

57 KVFKGFS LKDELSDLLFKKS FT I HFLDR 

3272 gaaatcaacagacaaacagttgaagcatttggcatgcaagtgattactgtatgtattacacatgaggattatttaaatgtggtt 

85 EINRQTVEAFGMQVITVCITHEDYLNVV 

3356 tattcatcaagtgaagttgaaaaatacttacaatcacaaggcttcacagaacacaatgaagatacaacaagtaacactgatgaa 

113 YSSSEVEKYLQSQGFTEHNEDTTSNTDE 

3440 acatcgaatcaaaatgctacatctttagacaattcaactggcatgactgcaaacagaaacgcttatgtgtcattaccacaaagt 

141 TSNQNATSLDNSTGMTANRNAYVSLPQS 

3524 gaggttaacattgatgttgataatacaacgttacgattcgctgataataatacgattgataacggtaaaactgtgaataaatcg

169 EVNIDVDNTTLRFADNNTIDNGKTVNKS

3608 agtaacgaaagtaatcaaaacgcaaaacgtaatcaaaatcaaaaaggtaatgcaaaaggtacacaattcactaagcagtattta 

197 SNESNQNAKRNQNQKGNAKGTQFTKQYL 

3692 attgataatattgataaagcgtacgatttaagaaagaaaattttaaatgaatttgataaaaaatgttttttacaaatctggtag 3775

225 IDNIDKAYDLRKKILNEFDKKCFLQIW* 
44AHJDORF009 
5744 atgaaatcacaacaacaagcaaaagaatggatatataagcatgagggggcaggtgttgactttgatggtgcatatggatttcaa
1 MKSQQQAKEW IYKHEGAGVDFDGAYGFQ 

5828 tgtatggacttatcagttgcttacgtgtattacattactgacggtaaagttcgcatgtggggtaatgctaaagacgcgataaat 

29 CMDLSVAYVYYITDGKVRMWGNAKDAIN 

5912 aatgactttaaaggtttagcgacggtgtataaaaatacaccgagctttaaacctcaattaggggacgttgctgtatatacaaat 

57 ND FKGLATVYKNTPS FKPQLGDVAVYTN 

5996 ggacaatatggacatattcaatgtgtgttaagtggaaatcttgattattatacatgcttagaacaaaactggttaggcggcggt 

85 GQYGHIQCVLSGNLDYYTCLEQNWLGGG 

6080 Cttgacggttgggaaaaagcaaccattagaacacattattatgacggtgtaactcactttattagacctaaattttcaggtagt 
113 FDGWEKATIRTHYYDGVTHFIRPKFSGS 

6164 aacagcaaagcattagaaacatcaaaagtaaatacatttggaaaatggaaacgaaaccaatacggcacatattatagaaatgaa

141 NSKALETSKVNTFGKWKRNQYGTYYRNE 

6248 aatggtacattcacatgtggttttttaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctattggttc

169 NGTFTCGFLPIFARVGSPKLSEPNGYWF

6332 caaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggttataactggcaaggcacacgt 

197 QPNGYTPYNEVCLSDGYVWI GYNWQGTR 

6416 tattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcataa 6496 

225 YYLPVRQ WNGKTGNSYSVG I PWGVFS * 
44AHJDORF010 

14420 ttggttagacatacgtctgaaatggatagatggaaaaaagaaagagaagctagaaaagagcaagaaaaagatttatttttaaat 

1 LVRHTSEMDRWKKEREARKEQEKDLFLN 

14336 gattttagtaatgttaattttaaatttgatgataaagatttacaagaggcgtacattgacacatggaaacattttgcacatctg 

29 DFSNVNFKFDDKDLQEAY IDTWKHFAHL 

14252 ccctattttcctaaagaaagaaacgtatcatatgtaaatgctgtatcattggtaagaggttcaagacataaaaaattaaattat 

57 PYF PKERN VSYVNAVS LVRG SRHKKLNY 

14168 attcttgaaatatataaccgtaatgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagct 

85 ILEIYNRNDDSNNKNAKKHKYALYNLQA 

14084 aaaaataataattcttcaatgtataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtg 

113 KNNNSSMYKYIKEIDTLYKE IGKSDRPV 

14000 acaaatattgatgatgaagatgtgaggtataactttttatattatgcaacatttgacgaataa 13938 

141 TNI DDEDVRYNFLYYATFDE * 
44AHJDORF011 

15593 atgacaaacgtaaaagatattttatcaagacaccaaaacacattagcgagatttgaatttgaggaaaaagaaagagaatttatc 

1 MTNVKDILSRHQNTLARFEFEEKEREFI

15509 aaactatcagaattagtagaaaaatacggtatgaaaaaagagtatatcgttagagcattattcacaaacaaagaatcaaaattc 

29 KLSELVEKYGMKKEY I VRALFTNKESKF 

15425 ggtgaacaaggtgttatcgtcactgatgactataacgtaaacttaccgaaccacttaacagaattaattaaagaaatgagagca 

57 GEQGVIVTDDYNVNLPNHLTELIKEMRA 

15341 gatgaggacgttgttgacattatcaatgctggagaagttcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggt 

85 DEDVVDI INAGEVQFT IYEYENKKGQKG 

15257 tactcaatcaattttggtcaagtatcattttaa 15225 

113 YSINFGQVSF* 
44AHJDORF012 

8391 atgaacgaagtaaaattcagatctacagactcagaagcgtttcacatgtttatatacgctggggatttaaaattactctacttt 

1 MNEVKFRFTDSEAFHMFIYAGDLKLLYF

8475 ttatttgtattaatgttcgttgatattattacaggtatttcaaaagcaattaaaaataataacttatggtcaaaaaaatcaatg

29 LFVLMFVDI ITGISKAIKNNNLWSKKSM 

8559 agaggattttctaaaaaattattgatattctgtattatcattctagcaaacatcattgaccagattttacaattaaaaggtggt 

57 RGFSKKLLIFCIIILANI IDQILQLKGG 

8643 ctactcatgattacaatattttattatattgcaaatgagggactttctattgtagaaaatcgtgcagaaatggacgtattagta 

85 LLMITIFYYIANEGLS IVENCAEMDVLV 

8727 ccagaacaaattaaagataaattaagagtcattaaaaatgatactgaaaagagtgataacaatgaacgatcaagagaagataga 

113 P E Q I KDKLRV I KNDTEKSDNNERSREDR 

8811 taa 8813 

141 * 
44AHJDORF013 

14996 atgaaaattaaaactacttttagattaaataatttaatttattaccttttaacaaatagagattattataatgataaatttgaa 

1 MKIKTTFRLNNLIYYLLTNRDYYNDKFE 

14912 aaatttacttcatctaataaaaaatgtatagtaaaaataaatatgggtgatgtgtatattgagtttgacaaacaatatgatgat 

29 KFTSSNKKCIVKINMGDVYI EFDKQYDD 

14828 tttgaaattgaaaaagagttatttacgttagatatcgacattgatattaaaaaacatgtttttaatatacttgtattttattat 

57 FEI EKELFTLDIDID IKKHVFNI LVFYY 

14744 agaaattatttaagtaatgaattaataagagaaattttattaaacgttacaattgacgacgtattatcaaattttgataaacct 

85 RNYLSNELIREILLNVTIDDVLSNFDKP

14660 cttgaaagcgaattaatgattatttatcaaaacaaagtcatatacgataatgggaaagtgattgaccatgaataa 14586

113 LESELMIIYQNKVIYDNGKVIDHE*
44AHJDORF113 

199 atgacagaatttgatgaaatcgtaaaaccagacgacaaagaagaaacttcagaatcaactgaagaaaatttagaatcaactgaa 

1 MTEFDEIVKPDDKEETSESTEENLESTE

283 gaaacttcagaatcaactgaagaatcaactgaagaatcaactgaagaatcaactgaagataaaacagtagaaacaatcgaagaa

29 ETSESTEESTEESTEESTEDKTVETIEE

367 gaaaatgaaaacaaattagaacctactacaacagatgaagatagttcgaaatttgaccctgttgtattagaacaacgtattgct 

57 ENENKLEPTTTDEDSSKFDPVVLEQRIA 

451 tcactagaacaacaagtgactacttttttatcttcacaaatgcaacaaccacaacaagtacaacaaacacaatcagatgtaaca 

85 SLEQQVTTFLSSQMQQPQQVQQTQSDVT 

535 gaatcaaacaaagaagataacgactattcagatgaagaactagttgataagttagatttagattag 600 
113 ESNKEDNDYSDEELVDKLDLD*

44AHJDORF114 

16172 atggttaatgttgataatgcaccagaagaaaaaggacaagcctatactgaaatgttgcaactattcaataaactgattcaatgg 
1 MVNVDNAPEEKGQAYTEMLQLFNKLIQW 
16088 aatccagcttatacatttgacaatgcaattaacttattatcggcttgccaacaactattattaaactataacagttctgttgtt 
29 NPAYTFDNAINLLSACQQLLLNYNSSVV 
16004 caattcttaaatgatgaactaaacaacgaaactaaaccagaatcaatattgtcttatattgctggcgatgacccaatagaacaa 

57 QFLNDELNNETKPES I LSYIAGDDPI E Q 

15920 tggaatatgcataaaggattttatgaaacgtataacgtttacgttttttag 15870 
85 WNMHKGFYETYNVYVF * 

44AHJDORF014 

6243 atgaaaatggtacatttacatgtggttttttaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctatt 
1 MKMVHLHVVFYQYLHVSVVQNYQNLMAI

6327 ggttccaaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggctataactggcaaggca 
29 GSNQTVIHH ITKFVYQMVTYGLVITGKA 

6411 cacgttattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcat 
57 HVIIYQCANGMEKQVIVTVLVFLGGCSH

6495 aatgggtattttagcctttttctttga 6521 
85 MGYFSLFL* 
44AHJDORF015 

15403 gtgacgataacaccttgttcaccgaattttgattctttgtttgtgaataatgctctaacgatatactcttttttcataccgtat 
1 VTITPCSPNFDSLFVNNALTIYSFFIPY 
15487 ttttctactaattctgatagtttgataaattctctttctttttcctcaaattcaaatctcgctaatgtgttttggtgtcttgat 

29 FSTNSDSLINSLSFSSNSNLANVFWCLD 
15571 aaaatatcttttacgtttgtcattttatttctcctcttatttaaattatttgctttctgcaattgcgatttgtag 15645 
57 KISFTFVILFLLLFKLFAFCNCDL* 
44AHJDORF016 

15852 atgaaagttgacgacattgttaccttacgtgtcaaaggttatatacttcattacttagatgatgataatgaatacattgaggaa 
1 MKVDD IVTLRVKGYI LHYLDDDNEY I EE 

15768 tttttaccacttcacgagtatcatttaaccaaaacacaagcaaaagaattattaccagacacatgtaaactattgtccactaca 
29 FLPLHEYHLTKTQAKELLPDTCKLLSTT 
15684 cgcacaacgaaaacaattcaagtttattacaatgatttactacaaatcgcaattgcagaaagcaaataa 15616 

57 RTTKTIQVYYNDLLQIAIAESK* 
44AHJDORF017 

10757 atggaaagattaaaattgcttctgctggtacaccgaaaaacgcctttgatacaagcgtcgattttgaaacctttgtacgtgaac 

1 MERLKLLLLVYRKTPLIQASI LKPLYVN 

10673 aattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaatatcgatatatccgtctaaaactg

29 NSLTVPLLKTIKVSIMSKVQYRYIRLKL

10589 aaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatga 10536

57 KLYVVMYMMNILLMNLI*
44AHJDORF018 

1098 atgttaattggtactgtgtccataatcacgtattcttcactatattgtccaataaaatcttgctctttagctaaccaattaaaa 

1 MLIGTVSIITYSSLYCPIKSCSLANQLK 

1014 cgattacctaatgcaatatcgattaataaagtctcattaatcttagggaataaatatttatttacaaatgtttcgaacattgta 

29 RLPNAISINKVSLILGNKYLFTNVSNIV 

930 tttgaattatcccatttgtcgccaaatgtccaagattttgaataa 886 

57 FELSHLSPNVQDFE*
44AHJDORF019 

9836 atgttacctggtttgtataagtattcttttttgaataaaggtacaccaattgcttttttatatttttctggtaactgtgcatat 

1 MLPGLYKYSFLNKGTPIAFLYFSGNCAY 

9752 gtccagttaccaccaatcacacgaccactttttccatttggcttgactgatttaccactaattggtttatggtctccgtcatca 

29 VQLPPITRPLFPFGLTDLPLIGLWSPSS

9668 tcagtaggattagaactactactcccactatctacttga 9630 

57 SVGLELLLPLST* 

44AHJDORF121 

16362 atggaaaatgaaacaaaaaacattgagttgaagcatgtttttcgttttaagaatggaagtttatgtatagcgttatttgataga 

1 MENETKNIELKHVFRFKNGSLCIALFDR 

16278 acagaaaatgaaatttcattttatgatgttgacattgacgaaattgaagatttaaatcataattctgttttacgcgtaatttca 

29 TENEISFYDVDIDEIEDLNHNSVLRVIS

16194 actttattaggaagtgataataatggttaa 16165 

57 TLLGSDNNG* 
44AHJDORF020 

13865 atgtctaaacgattttgttttaccatgtttttgctccttgtaatagtttatgatgtcgtttacagtgttaaatttattcgtcaa 

1 MSKRFCFTMFLLLVIVYDVVYSVKFIRQ 

13949 atgttgcataatataaaaagttatacctcacatcttcatcatcaatatttgtcactggtctatctgatttaccaatttctttat

29 MLHNIKSYTSHLHHQYLSLVYLIYQFLY 

14033 ataaagtatcgatttctttaa 14053 

57 IKYRFL*

44AHJDORF123

614 atgtatgagggaaacaacatgcgttctatgatgggtacatcatatgaagattcaagattaaataaacgaacagaattaaatgaa 

1 MYEGNNMRSMMGTSYEDSRLNKRTELNE 

698 aacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacaggtgac 

29 NMSIDTNKSEDSYGVQIHSLSKQSFTGD 

782 gttgaggaggaataa 796 

57 VEEE*
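
The pairing of nucleotide lines and translations in this listing can be checked mechanically. The following is a minimal sketch, not part of the original disclosure: it assumes the standard genetic code, and the helper names (`translate`, `residues_from_coordinates`) are hypothetical. It re-translates the three nucleotide lines of 44AHJDORF123 given above and derives the residue count implied by the ORF coordinates (614 to 796).

```python
# Standard genetic code, indexed by base order T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string, read 5' to 3', codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def residues_from_coordinates(start: int, end: int) -> int:
    """Residue count implied by inclusive ORF coordinates, excluding the stop codon."""
    span = abs(end - start) + 1
    return span // 3 - 1


# Nucleotide lines of 44AHJDORF123 as they appear in the listing (positions 614 to 796).
ORF123_DNA = (
    "atgtatgagggaaacaacatgcgttctatgatgggtacatcatatgaagattcaagattaaataaacgaacagaattaaatgaa"
    "aacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacaggtgac"
    "gttgaggaggaataa"
)

ORF123_PROTEIN = translate(ORF123_DNA)
print(ORF123_PROTEIN)
print(residues_from_coordinates(614, 796))
```

For ORFs listed with descending coordinates, the nucleotide lines are already written in the reading direction, so the same `translate` call applies without taking a reverse complement.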
44AHJDORF021

5816 atgcaccaccaaagtcaacacctgccccctcacgcttatatatccattcttttgcttgttgttgtgatttcatttatatcactc
1 MHHQSQHLPPHAYISILLLVVVISFISL

5732 ctatttttgatgttttgctacccaaccatattcacgatgttttgtttccgcattc^cattactgaagaattctttatattccga 
29 LFLMFCYPTIFTMFCFRINITEEPFIFR 
5648 tatattagcctctaa 5634
57 YISL*

44AHJDORF022 

8611 atgtttgctaaaatgataatacagaatatcaataattttttagaaaatcctctcattgatttttttgaccataagttattattt 
1 M F A K M I IQN INNFLENPLI DFFDHKLLF 

8527 ttaattgcttttgaaatacctgtaataatatcaacgaacattaatacaaataaaaagtag 8468 
29 LIAFEIPVIISTNINTNKK* 
44AHJDORF023 

6494 atgagaacaccccccaaggaataccaacactgtaactattacctgtttttccattccattggcgcactggtaaataataacgtg 
1 MRTPPKEYQHCNYYLFFHS IGALVNNNV 

6410 tgccttgccagttataaccaatccatacgtaaccatctgataaacaaacttcgttatatggtgtataaccgtttggttggaacc 
29 CLASYNQS IRNHLINKLRYMVYNRLVGT 

6326 aatagccattag 6315 
57 N S H * 

44AHJDORF024 

14275 gtgtcaatgtacgcctcttgtaaatctttatcatcaaatttaaaattaacattactaaaatcatttaaaaataaatctttttct
1 VSMYASCKSLSSNLKLTLLKSFKNKSFS

14359 tgctcttttctagcttctctttcttttttccatctatccatttcagacgtatgtctaaccaatgttatcaacctccatataaag 
29 CSFLASLSFFHLSISDVCLTNVINLHIK 
14443 cataaataa 14451 
57 H K * 

44AHJDORF025 

15175 atggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgtttgaagat 
1 MERKYKTVLLYCDEI KGHF PHQ I SMFED 

15091 ttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatagaatacattaag 
29 LYDAKVVYSYYEYNLFTKKY AYI IEYIK 

15007 gagatataa 14999 
57 EI* 
44AHJDORF026 

14593 atgaataacctattaaacatagccattgttttccttttagcatttttaattacacttatcatacttatgacactgcatatacgc 
1 MNNLLNIAIVFLLAFLITLI ILMTLHIR 

14509 gcgtcatttggtgttttattcactacattgattatattctatattatctttttaatggttatttatgctttatatggaggttga 
14426 

29 VSFGVLFTTLIIFYI IFLMVIYALYGG* 

44AHJDORF027 

12916 atgattgtctatatccctaattttagtacaaaattcatattgttttgtatatggtacaacgataatatttgtcataaaagtagt 
1 MIVYIPNFSTKFILFCIWYNDNICHKSS 
13000 tacattatacatgactttaatatatttatcatcagttttgatatagaagaaatcaccgttttgattgatgtgatttcttaa 
13080 

29 YIIHDFNIFIISFDIEEITVLIDVIS* 
44AHJDORF029 

15183 gtgtttaaatggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacattttccacatcaaatctcaatgt 
1 VFKWNVNTKRYYYIAMRLKDI FHIKSQC 

15099 ttgaagatttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatag 
15019 

29 LKIYMTLKLYIHIMNITCSLKNTRIS* 



44AHJDORF028 

9235 atggaatatatgcacgtccaattgtacctgctttcatattttttgcaaaatctgcattaccttttctttgtacgtcttgtggta 
1 MEYMHVQLYLLSYFLQNLHYLFFVRLVV 
9151 caaagtggacgatgttacctgcgtcataccaagacggttgtccagcttgttttgattgtgatactaactttcttgctatga 9071 
29 QSGRCYLRHTKTVVQLVLIVILTFLL* 
44AHJDORF030 

14487 gtgaataaaacaccaaatgacacgcgtatatgcagtgtcataagtatgataagtgtaattaaaaatgctaaaaggaaaacaatg 
1 VNKTPNDTRICSVI S M I SVI KNAKRKTM 

14571 gctatgtttaataggttattcatggtcaatcactttcccattatcgtatatgactttgttttgataaataatcattaa 14648 
29 AMFNRLFMVNHFPI IVYDFVLINNH* 

44AHJDORF031

11039 atgatattgtatagttcattgttatcatctaaacggaataagttaaaatgtgaacgtaatgcaggtatgccatataatccattt
1 MILYSSLLSSKRNKLKCERNAGMPYNPF 
11123 aaaacgactttagataacataacctcctcatttgagtatgggtgttcgttgatatcatcagtaatgtga 11191 
29 KTTLDNITSSFEYGCSLISSVM* 

44AHJDORF135

693 atgaaaacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacag
1 MKTCQLIQIKVKIVMVYKFIHFQNNHLQ

777 gtgacgttgaggaggaacaataaattatggcacaacaatctacaaaaaatgaaactgcacttttag 842 
29 VTLRRNNKLWHNNLQKMKLHF* 
44AHJDORF033 

3795 atgccattatttaaccacctctaccaaatttgtaaaaaacattttttatcaaattcatttaaaattttctttcttaaatcgtac 
1 MPLFNHLYQICKKHFLSNSFKI FF LKSY 
3711 gctttatcaatattatcaattaaatactgcttagtgaattgtgtaccttttgcattacctttttga 3646
29 ALSILSIKYCLVNCVPFALPF * 

44AHJDORF032 

9455 atggcttgttttgctaaagcgagtagtgaactaccactgtcaccactactaccactgtcagacgaatcactaggtgatccacct 
1 MAC FAKASSELPLS PLLPLSD ESLG D PP 

9371 ttaccgtctaatttaccaccccaagctagaatagtattcgcaccgtctaaaaatggattaccatag 9306 
29 LPSNLPPQARIVFAPSKNGLP* 
44AHJDORF034 

14146 atgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagctaaaaataataattcttcaatgt 
1 MMILIIKTLKSINTLYI IYKLKIIILQC 

14062 ataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtga 14000 
29 INI LKKSILYIKKLVNQIDQ* 

44AHJDORF035 

13957 atgcaacatttgacgaataaatttaacactgtaaacgacatcataaactattacaaggagcaaaaacatggtaaaacaaaatcg 
1 MQHLTNKFNTVNDI INYYKEQKHGKTKS 

13873 tttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataa 13811 
29 FRHGKRLSKCCQSCQKKNPR* 
44AHJDORF036 

10165 gtgtatacaataccacacgtgatggtgcaacatatggtggtacattacagtttgcaactaaaaacgaaccatcttcaaaaactg 
1 VYT I PHVMVQHMVVHYSLQLKTNHLQKL 

10081 ctacaacaacacctgtgtgaccaataccatatgcagttgcttgtaagtatggtggtttactag 10019 
29 LQQHLCDQYHMQLLVSMVVY* 
44AHJDORF037 

14788 atgtcgatatctaacgtaaataactctttttcaatttcaaaatcatcatattgtttgtcaaactcaatatacacatcacccata 
1 MSISNVNNSFSISKSSYCLSMSIYTSPI 
14872 tttatttttactatacattttttattagatgaagtaaatttttcaaatttatcattataa 14931 
29 FIFTIHFLLDEVNFSNLSL* 
44AHJDORF038 

3671 gtgtaccttttgcattacctttttgattttgattacgttttgcgttttgattactttcgttactcgatttattcacagttttac 
1 VYLLHYLFDFDYVLRFDYFRYS IYSQFY 

3587 cgttatcaatcgtattattatcagcgaatcgtaacgttgtattatcaacatcaatgttaa 3528 
29 RYQSYYYQR IVTLYYQHQC * 

44AHJDORF039 

1743 gtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaaccaaaatcagatttatatg 
1 VLYLLMMYLNLKSLLATLKKLNQNQIYM 
1827 cgtttattttggatattaattcaattaaatataaacgttacacaaaaggtatgttaa 1883 
29 RLFWILIQLNINVTQKVC* 
44AHJDORF040 

9740 gtggtaactggacatatgcacagttaccagaaaaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttataca 
1 VVTGHMHSYQKNI KKQLVYLYSKKNTYT 

9824 aaccaggtaacatatttcctcaaacgggtaatgcaggacaatgtacagaattaa 9877 
29 NQVTYFLKRVMQDNVQN* 
44AHJDORF041 

15836 atgtcgtcaactttcattattatatcactcctttctaaaaaacgtaaacgttatacgtttcataaaatcctttatgcatattcc 
1 MSSTFII IS LLSKKRKRYTFHKI LYAYS 

15920 attgttctattgggtcatcaccagcaatataagacaatattgattctggtttag 15973 
29 IVLLGHHQQYKTILILV* 

44AHJDORF042 

5151 atgcacgaccgtcgtcttttgttaatttatagttttgtgaacctcttgcgcgtaatgcttcaaagtgttcatactcaccaagtt
1 MHDRRLLLI YSFVNLLRVMLQSVHTHQV 

5067 ggaagaaaccatataaattatggaaacgttttccaccaccgccgtttgtcatag 5014 
29 GRNHINYGNV FHHRRLS* 

44AHJDORF043 

4539 atgcgacttgtaacagttttgcaacaccatcgtgatgtaaccagattttcatttcaccattggattgacgttctaatccgattg 
1 MRLVTVLQHHRDVTRFSFHHWIDVLIRL 
4455 ttgtaccatgaccaccctgtacaatacgcatgcttgaaattaagtcaccactag 4402 
29 LYHDHPVQYACLKLSHH* 
44AHJDORF044 

12917 atgttacctatttacgtgatgatatgttttataaagaaaacatggaacgttattactacaatccaagcaatttacattttgaca 
1 MLPIYVMICFIKKTWNVITTIQAIYILT 
12833 atgcttactctaaaaattacgtggttgataatgatagatatttatatttag 12783 

29 MLTLKITWLIMIDI YI * 

44AHJDORF149 

770 atgattgttttgaaagtgaatgaatttgtacaccataactatcttcacttttatttgtatcaattgacatgttttcatttaatt 
1 MIVIiKVNEFVHHNYLHFYLYQLTCFHLI 
686 ctgttcgtttatttaatcttgaatcttcatatgatgtacccatcatag 639 
29 LFVYLILNLHMMYPS* 
44AHJDORF046 

4891 atgattatccatttaagttatcatatcaagacggtattaatttcccacgtgataactttaaagagcctgagggtattxgc?a'ttt 
1 MIIHLSYHIKTVLISHVITLKSLRVFAF

4975 atacaaatccaaaaacaaaacgtaaatcgttattacttgctatga 5019
29 IQIQKQNVNRYYLL* 
44AHJDORF047 

11911 atgaatgtatgtaagttgttcaggtgtgagttttgcaaaacatttcacagcatagtcataggcttcactatcattcatatcatt 
1 MNVCKLFRCEFCKTFHSIVIGFTIIHI I 
11995 atctttatcaaaaatcgtataattaaaatctgttttaagttgtga 12039
29 IFIKNRIIKICFKL* 
44AHJDORF045

10655 atggcaccgtcaaagaattgttcacgtacaaaggtttcaaaaccgacgcttgtatcaaaggcgtttttcggtataccagcagaa 
1 MAPSKNCSRTKVSKSTLVS KAFFG I PAE 

10739 gcaattttaatctttccattcacttcatatgcatatttcttatga 10783 
29 AILI FPFTSYAYFL* 

44AHJDORF048 

15340 atgaggacgttgttgacattatcaatgctggagaagttcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggtt 
1 MRTLLTLSMLEKFNSQFMNMKTKKVKKV
15256 actcaatcaattttggtcaagtatcattttaatacaatttcatag 15212 
29 TQS ILVKYHFNTIS* 

44AHJDORF049 

5784 atgagggggcaggtgttgactttgatggtgcatatggatttcaatgtatggacttatcagttgcttatgtgtattacattaccg 
1 MRGQVLTLMVHMDFNVWTYQLLMCITLL 
5868 acggtaaagttcgcatgtggggtaatgctaaagacgcgataa 5909
29 TVKFACGVMLKTR* 
44AHJDORF050 

13158 gtgtgttacgtttttcattcacgtaatcgtttcgtcgcatttctaaaaaaatgtttttgtaaagtcttgatgtattcattttat 
1 VCYVFHSRNRFVAFLKKCFCKVLMYSF Y 

13242 gcttttgtaataaattgtatatatttaaattggataatatag 13283 
29 AFVINCIYLNWII* 
44AHJDORF051

11066 atgataacaatgaactatacaatatcattaacggttacaaaaacactgaacgtaatatattattctctacatttgtcacatcac 
1 MITMNYTISLTVTKTLNVIYYSLHLSHH
10982 gttcattgtataacttattggttcctttccaatacttaa 10944 
29 VHC ITYWFLSNT* 

44AHJDORF052 

14338 atgattttagtaatgttaattttaaatttgatgataaagatttacaagaggcgtacattgacacatggaaacattttgcacatc 
1 MILVMLI LHLMI KIYKRRTLTHGNI LHI 

14254 tgccctattttcctaaagaaagaaacgtatcatatgtaa 14216 
29 CPI FLKKETYHM* 

44AHJDORF053 

3348 atgtggtttattcatcaagtgaagttgaaaaatacttacaatcacaaggcttcacagaacacaatgaagatacaacaagtaaca 
1 MWFIHQVKLKNTYNHKASQNTMKIQQVT 
3432 ctgatgaaacatcgaatcaaaatgctacatctttag 3467 
29 LMKHRIKMLHL* 
44AHJDORF054 

7551 atgactggaatggaaatacgatgttactcgacgctggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatta 
1 MTGMEIRCYSTLVRFHKKLVLSYVQMQL 
7635 ttggttatcataatgaagttcgagtatatccagtag 7670 
29 LVI IMKFEYIQ* 

44AHJDORF055 

15705 atgtgtctggtaataattcttttgcttgtgttttggttaaatgatactcgtgaagtggtaaaaattcctcaatgtattcattat 
1 MCLVIILLLVFWLNDTREVVKI PQCIHY 

15789 catcatctaagtaatgaagtatataacctttga 15821 
29 HHLSNEVYNL* 
44AHJDORF056 

5512 gtgagtattacattacaggtaaccaaatggaattatttagagacgcgccagaagaaattaaaaaagtgggtgcatggttacgtg 
1 VSITLQVTKWNYLETRQKKLKKWVHGYV 
5596 tgtcaagtggtaacgcagtcggtgaagtaa 5625 
29 CQVVTQSVK* 
44AHJDORF057 

10121 atgtaccaccatatgttgcaccatcacgtgtggtattgtatacactcattaatggcgtaccaaataatgctggtgataatattg 
1 MYHHMLHHHVWYCIHSLMAYQIMLVII L 

10205 tattctttagtggtattgcttaattaa 10231 
29 YSLVVLLN* 
44AHJDORF058 

10767 atgcatatttcttatgattcagtacaaacatcttatctatctgttcgttttcaatatcccatttacctaaggctatcgggtcga 
1 MHISYDSVQTSYLSVRFQYPIYLRLSGR 
10851 ataaactggggttcaataagggtttaa 10877 
29 INWGSIRV* 
44AHJDORF164 

702 atgttttcatttaattctgttcgtttatttaatcttgaatcttcatatgatgcacccatcatagaacgcatgttgtttccctca 
1 MFSFNSVRLFNLESSYDVPI IERMLFPS 

618 tacatgtttaaattcctcctaatctaa 592 
29 YMFKFLLI* 

44AHJDORF059

8360 atggattttgtaacattggattacctgaaccgtcattatgccaaaatcttacaccagattctaaaattgcttttaattgttcca

1 MDFVTLDYLNRHYAKILHQILKLLLIVP

8276 ttaacatggggtcgatgtcacgtatag 8250 

29 LTWGRCHV * 

44AHJDORF060 

6257 atgtaccattttcatttctacaatatgtgccgtattggtttcgtttccattttccaaatgtatttacttttgatgtttctaatg 
1 MYHFHFY NMCRIGFVSI FQMYLLLMFLM 
6173 ctttgctattactacctgaaaatttag 6147
29 LCYYYLKI* 
44AHJDORF061 

15551 atgtgttttggtgtcttgataaaatatcttttacgcttgtcattttatttctcctcttatttaaattatttgctttctgcaatt
1 MCFGVLIKYLLRLSFYFSSYLNYLLSAI

15635 gcgatttgtagtaaatcactgtaa 15658
29 AICSKSL*
44AHJDORF062 

4285 gtggtattcgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaa
1 VVFATQLTN LLI LI KKQITCTLHNP I LK 

4369 aacctgaaggtttttggataa 4389 
29 NLKVFG* 
44AHJDORF063 

9487 atgcgtcttgtattttttttaataattcccgcatggcttgttttgctaaagcgagtagtgaactaccactgtcaccactactac 
1 MRLVFFLI I LAWLVLLKRVVNYHCHHYY 

9403 cactgtcagacgaatcactag 9383 
29 HCQTNH* 
44AHJDORF065 

5029 gtggtggaaaacgtttccataatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggtt 
1 VVENVSI IYMVSSNLVSMNTLKHYAQEV 

5113 cacaaaactataaattaa 5130 
29 H K T I N * 

44AHJDORF064 

2609 atgacgagtcaatcaatcaacttgtgtccgaaatatataacggtgcaccatttgttaaaatgtcacctatgtttaatgcagatg 
1 MTSQS INLCPKYITVHHLLKCHLCLMQM 

2693 acgatatcattgatttaa 2710 
29 T I S L I * 

44AHJDORF066 

10481 atgatattctttatattgaaagtgacatcggttcattttcacttaacgacttatttccagttgaacgttcagtacataacaaat 
1 MIFFILKVTSVHFHLTTYFQLNVQYITN
10397 ctgatttgcatatattaa 10380
29 LICIY*
Table 19



Sequence similarities between ORFs 44AHJD and public databases 



Phage: 44AHJD 
Database: nr 

                                                                    Score    E
Hit description                                                     (bits)  value

Query= sid|110871|lan|44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1
(761 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0...     55   1e-06
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|e...     53   6e-06
gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage...     49   1e-04
gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage...     46   0.001
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP...     45   0.002
gi|2435429 (AF012250) unassigned reading frame (possible DNA po...     45   0.002
gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum po...     45   0.002
gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora...     44   0.004
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833...     44   0.004
gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 (PHO...     41   0.041
gi|22SB375|gb|AAD11909.1| (AF007261) transcription initiation f...     40   0.070
gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) [Bact...     39   0.092

Query= sid|110872|lan|44AHJDORF002 Phage 44AHJD ORF|3789-5732|3
(647 letters)

gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTE...    112   7e-24
gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]                 52   1e-05
gi|4038407 (AF103943) factor C protein precursor [Streptomyces ...     39   0.10

Query= sid|110873|lan|44AHJDORF003 Phage 44AHJD ORF|6626-8389|2
(587 letters)

gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >...     92   8e-18
gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >...     82   1e-14
gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B...     78   2e-13
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2...     71   2e-11
gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage...     54   3e-06
gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage...     42   0.010

Query= sid|110875|lan|44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1
(415 letters)

gi|3845203 (AE001399) GAP domain protein (cyclic nt signal tran...     52   6e-06
gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon; MA...     49   5e-05
gi|3845297 (AE001421) hypothetical protein [Plasmodium falciparum]     48   1e-04
gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon; ...     47   2e-04
gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]     46   6e-04

Query= sid|110877|lan|44AHJDORF007 Phage 44AHJD ORF|2044-3027|1
(327 letters)

gi|1181960|emb|CAA87731.1| (Z47794) connector protein [Bacterio...     46   5e-04
gi|1429239|emb|CAA67658| (X99260) upper collar protein [Bacteri...     45   8e-04
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...     44   0.002
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...     41   0.009

Query= sid|110878|lan|44AHJDORF008 Phage 44AHJD ORF|3020-3775|2
(251 letters)

gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase ...     52   3e-06
gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SP...     46   2e-04
gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon; MA...     46   2e-04
gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP) >...     46   3e-04
gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]      46   3e-04
gi|2952545 (AF051898) coronin binding protein [Dictyostelium di...     45   6e-04
gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium reic...     45   7e-04
gi|1429240|emb|CAA67659| (X99260) lower collar protein [Bacteri...     44   0.001
                                                                    Score    E
Hit description                                                     (bits)  value

Query= sid|110879|lan|44AHJDORF009 Phage 44AHJD ORF|5744-6496|2
(250 letters)

gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine ...    180   1e-44
gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN (N-ACETYLMURAMOYL-L-AL...    118   6e-26
gi|1763243 (U72397) amidase [bacteriophage 80 alpha]                  118   6e-26
gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphyloco...     84   9e-16
gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]     84   9e-16
gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187 ...     77   2e-13
gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE AL...     73   2e-12
gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus sim...     69   3e-11
gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR (GLYCYL-GL...     69   3e-11
gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR (GLYCYL-G...     69   3e-11
gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan hy...     68   6e-11

Query= sid|110882|lan|44AHJDORF012 Phage 44AHJD ORF|8391-8813|3
(140 letters)

gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN ...     80   6e-15
gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-...     76   1e-13
gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN ...     61   4e-09
gi|2293160 (AF008220) YtkC [Bacillus subtilis] >gi|2635548|emb|...     36   0.099
gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophag...     31   3.3
Table 20



Homologies between phage 44AHJD ORFs and proteins in public databases



Query= pt|110871 44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1 1
(761 letters)

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 55.4 bits (131), Expect = 1e-06

Identities = 96/426 (22%), Positives = 159/426 (36%), Gaps = 88/426 (20%)



Query: 229 KLTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTR FQ 283 

++TPE+ YI ND+ I+ DI +++T + ++ + + T+ F 

Sbjct: 154 EITPEEYEYIKNDIEIIARA LDIQFKQQLDRMTAGSDSLKGFKDILSTKKFNKVFP 209 

Query: 284 LLNQYQDIKISYTHYHFHDMNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSID 343 

L+ D +I + YRGG N KY K I E D+NS YP 

Sbjct: 210 252 

Query: 344 YVMYHEKIPTWLYFYEHYSEPTLIPTFLDDDNYFSLYKIDKDVFNDDLLIKIKSRVLRQM 403 

MY+P YP+ +D+LYI+F+L K+ + 

Sbjct: 253 SQMYSRPLP YGAPIVFQGKYEKDEQYPLY-IQRIRFEFEL KEGYIPTI 299 

Query: 404 XXXXXXXXXXXXXXXXXXLRMIQ-DITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

+ ++ +T +D I+ + + +Y EY F + 

Sbjct: 300 QIKKNPFFKGNEYLKNSGVEPVELYLTNVDLELIQEH-YELYNVEYIDGFK FRE 352 

Query: 463 TQGKLKNKINMTSP YDYHITDDINEHPYSNEEVMLSKVVLNGLYG IPAL 511 

G K+ I+ + H + L+K++LN LYG +P L 

Sbjct: 353 KTGLFKDFIDKWTYVKTH EEGAKKQLAKLMLNSLYGKFASNPDVTGKVPYL 403 

Query: 512 RSHFNL-FRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 

+ +L FR+ D YK+ + F+T+ + + + Q D 

Sbjct: 404 KDDGSLGFRVGDEE YKDPVYTPM-GVFITAWARFTTITAAQACY DRI 449 

Query: 571 IYCDTDSLYMKSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVLNHKK YAYEVNG 625 

IYCDTDS+++ P++DPLGWE+ +LK Y EV+G 

Sbjct: 450 IYCDTDSIHLTGTEVPEIIKDIVDPKKLGYWAHES-TFKRAKYLRQKTYIQDIYVKEVDG 508 

Query: 626 KIKIAS 631 

K+K S 

Sbjct: 509 KLKECS 514 
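
Each hit header above reports Identities, Positives and Gaps as a count over the alignment length, followed by a rounded percentage. The sketch below shows one way these fields might be read back and the fractions recomputed. It is offered only as an illustration; `parse_stats` and `fraction` are hypothetical helper names, and because the printed percentages are rounded, a recomputed value can differ from the printed one by about a point.

```python
import re
from typing import Dict, Tuple

# Matches e.g. "Identities = 96/426" and captures the label and both counts.
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_stats(line: str) -> Dict[str, Tuple[int, int]]:
    # Returns {"identities": (matches, alignment_length), ...}.
    return {
        name.lower(): (int(count), int(total))
        for name, count, total in STAT.findall(line)
    }


def fraction(pair: Tuple[int, int]) -> float:
    count, total = pair
    return count / total


header = "Identities = 96/426 (22%), Positives = 159/426 (36%), Gaps = 88/426 (20%)"
stats = parse_stats(header)
for name, pair in stats.items():
    print(f"{name:10s} {pair[0]:4d}/{pair[1]:<4d} = {fraction(pair):.3f}")
```

The example uses the statistics line of the phage M2 DNA polymerase hit; the same pattern applies to the other hit headers in this table.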





>gi|1072656|pir||S51275 DNA polymerase - phage CP-1 
>gi|836593|emb|CAA87725.1| (Z47794) DNA polymerase 
[Bacteriophage CP-1] 
Length = 568 

Score = 53.5 bits (126), Expect = 6e-06 

Identities = 104/464 (22%), Positives = 169/464 (36%), Gaps = 66/464 (14%) 

Query: 230 LTPEQLTYIHNDVIIL--GMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLNQ 287 

+ PE + YIH DV IL G+ ++Y + F Y + +L + +F+ 
Sbjct: 152 IKPEWIDYIHVDVAILARGIFAMYYEENFTK--YTSASEALTEFKRIFRKSKRKFRDFFP 209 

Query: 288 YQDIKISYTHYHFHDMNFYDYIKSFYRGGLNMYNTKY 347 

D K+ D+ + G + K+ + +++ DINS YP M 

Sbjct: 210 ILDEKVD - - -DFCRKHIVGAGRLPTLKHRGRTLNQLIDIYDINSMYPATML 257 

Query: 348 HEKIPTWLYFYEHYSEPTLIPTFLDDDNYFSLY-KIDKDVFNDDL-LIKIKSRVLRQMXX 405 

+P + ♦ Y P ♦ +D+Y+ + K D D+ L I+IK ++ 

Sbjct: 258 QNALPIGIP - -KRYKGK PKEIKEDHYYIYHIKADFDLKRGYLPTIQIKKKLDALRIG 312 

Query: 406 XXXXXXXXXXXXXXXXLRMIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIKTQG 465 

L *■ + H +EF +F+Y 
Sbjct: 313 VRTSDYVTTSKNEVIDLYLTNFDLDLFUCHYDATIMYVETLE-FQTESDLFDDYI 366 
Query: 466 KLKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYGIPALR--SHFNLFRLDDN 523 

+ Y Y E+ S E +K++LN LYG ♦ S L LDD 

Sbjct: 367 - TTYRYK KENAQSPAEKQKAKIMLNSLYGKFGAKI I SVKKIiAYLDDK 412 

Query: 524 NELYNI INGYKNTERNIL FSTFVTS RS LYNLL VP FQYLTES E I DDNF I YCDTD S 577 

L +KN + + + FVTS + + ++ Q E DNF+Y DTDS 

Sbjct: 413 GILR FKNDDEEEVQPVYAP VALFVTS I ARHFI I SNAQ ENYDNFLYADTDS 462 

Query: 578 LYT^SVVKPLLNPSLFDPIAIXSKWDIENEQIDKMF^ 637 

L++ +L+ DP GKW E +K LKYE++ ♦ K 

Sbjct: 463 LHLFHSDSLVXiD IDPSEFGKWAHEGRAV- KAKYLRSKLYTEELIQEDGTTHLDV-KG 517 

Query: 638 AFDTSVDFETFVREQFFDGAIIENNKSIYNEQGTISIYPSKTEI 681 

AT E B F GA E ++ +G IY + +1 
Sbjct: 518 AGMTPE I KEKI TFEKFV1 GATFEGKRAS KQI KGGTLI YETTFKI 561 



>gi|1429230|emb|CAA67649| (X99260) DNA polymerase [Bacteriophage 
B103] 

Length = 572 
Score = 49.2 bits (115), Expect = 1e-04 

Identities = 93/422 (22%), Positives = 155/422 (36%), Gaps = 88/422 (20%) 

Query: 229 KLTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTR FQ 283 

++TPE+ YI ND+ I+ DI +++T + ++ + + T+ F 

Sbjct: 154 EITPEEYEYIKNDIEIIARA LDIQFKQGLDRMTAGSDSLKGFKDILSTKKFNKVFP 209 

Query: 284 LLNQYQDIKISYTHYHFHDMNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYP 343 

L+ D +I + YRGG N KY K I E D+NS YP 

Sbjct: 210 KLSLPMDKEI-- - RRAYRGGFTWLNDKYKEKEIGEGMV- FDVNSLYP 252 

Query: 344 YVMYHEKIPTWLYFYEHYSEPTLIPTFLDDDNYFSLYKIDKDVF^ 403 

MY +P Y P + + D + LY I + F +L K+ + 
Sbjct: 253 SQMYSRPLP - YGAPIVFQGKYEKDEQYPLY- IQRIRFEFEL KEGYIPTI 299 

Query: 404 XXXXXXXXXXXXXXXXXXLRMIQ-DITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

++ +T +D I+ + + +Y EY F + 

Sbjct: 300 QIKKNPFFKGNEYLKNSGAEPVELYLTNVDLELIQEH-YEMYNVEYIDGFK FRE 352 

Query: 463 TQGKLKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYG IPAL 511 

G K I+ + H + L+K++ + LYG +P L 

Sbjct: 353 KTGLFKEFIDKWTYVKTH ERGAKKQLAKLMFDSLYGKFASNPDVTGKVPYL 403 

Query: 512 RSHFNL-FRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 

+ +L FR+ D YK+ + F+T+ + + + Q D 
Sbjct: 404 KEDGSLGFRVGDEE YKDPVYTPM-GVFITAWARFTTITAAQACY DRI 449 

Query: 571 IYCDTDSLYMKSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVLNHKK YAYEVNG 625 

IYCDTDS+++ P + + DP LG W E+ + L K YA EV+G 

Sbjct: 450 IYCDTDSIHLTGTEWEIIKDIVDPKKLGYWAHES-TFKRAKYLRQ 508 

Query: 626 KI 627 
K+ 

Sbjct: 509 KL 510 

>gi|1572479|emb|CAA65712| (X96987) DNA polymerase [Bacteriophage 
GA-1] 

Length = 578 
Score = 46.1 bits (107), Expect = 0.001 

Identities = 80/376 (21%), Positives = 146/376 (38%), Gaps = 54/376 (14%) 

Query: 234 QLTYIHNDVIILGMCHIHYSDIFPNFDYNKLTFSI^IMESYLNNEMTRFQUJIQYQDIKI 293 

++ Y+ +D++I+ + +PND+ +T + + +Y EM ♦ +Y + 

Sbjct: 162 EIEYLKHDLLIVALA IJaSMFDN-DFTSMTVTjSDALNTY- - KEMLGVKQWEKYFPVL- 214 

Query: 294 S YTHYH FHDMN FYDYI KS FYRGGLNMYNTKYINKL I DE PCFS I D INS S YP YVMYHEKI PT 353 

+ 1+ Y+GG N KY + + D+NS YP +M ++ +P 

Sbjct: 215 - -SLKVNSEIRKAYKGGFTWWPKYQGETVTGGMV-FDVNS 264 
Sbjct: 265 YGEPVMFKGEYKKNVEYPLYIQQVRCFFELKKDKIPCIQIKGNARFGQNEYLS 317 

Query: 414 XXXXXXXXLRMIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIKTQGKLKNKINM 473 

J * L +t +D X* + + X+B E+ +F+ + I 
Sbjct: 318 TSGDEYVDLY VTNVDWELI KKH-YDIFEEEF I GG > -FMFKGF IGF 359 

Query: 474 TSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYGIPALRSHFN--LFRLDDNNELYNIIN 531 

Wy Y +NSE+ + +K++LN LYG A + LD+N L 

Sbjct: 360 FDEY I DRFME I KNS PDS SAEQS LQAKLMLNSLYG1CFATNPD ITGKVP YIiDElJGVLKFRKG 419 

Query: 532 GYKNTERNILFST FVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMKSVVKPLL 588 

K ER+ F+T+ + N+L Q L FIY DTDS++++ + + 

Sbjct: 420 ELK- - ERDPVYTPMGCFITAYARENILSNAQKLYP RFIYADTDSIHVEGLGEVDA 472 

Query: 589 NPSLFDPIALGKWDIE 604 

+ DP LG WD E 
Sbjct: 473 IKDVIDPKKLGYWDHE 488 

>gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2) 
>gi|75812|pir|TERBP2Z DNA-directed DNA polymerase (EC 
2.7.7.7) - phage PZA >gi|216051 (M11813) gene 2 product 
[Bacteriophage PZA] >gi|224741|prf||1112171E ORF 2 
[Bacteriophage PZA] 
Length = 572 

Score = 45.3 bits (105), Expect = 0.002 
Identities = 98/461 (21%), Positives = 166/461 (35%), Gaps = 110/461 (23%) 

Query: 198 QLKTDFNYTIFDKDNDMNDSEAYDYAVKCFAKLTPEQLTYIHNDVIILGMCHIHYSDIFP 257 

++ DF T+ D D + Y ++TP++ YI ND+ I+ + I 

Sbjct: 129 KIA10>FKLTVLKGDIDYHKERPVGY EITPDEYAYIKNDIQIIAEALL IQF 178 

Query: 258 NFDYNKLTFSLNIMESYLNNEMTR FQLLNQYQDIKISYTHYHFHDMNFYDYIKSF 312 

+++T + + + T+ F L+ D ++ Y 
Sbjct: 179 KQGLDRMTAGSDDLKGFKDIITTKKFKKVFPTLSLGLDKEVR YA- 222 

Query: 313 YRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYHEKIPTWLYFYEHYSEPTLIPT--F 370 

YRGG N ++ K I E D+NS YP MY +P Y EP + 

Sbjct: 223 YRGGFTWLNDRFKEKEIGEGMV-FDVNSLYPAQMYSRLLP - YGEPIVFEGKYV 273 

Query: 371 LDDDNYFSLYKID KDVFNDDLLIKIKSRVLRQMXXXXXXXXXXXXXXXXXXLRMI 425 

D+D +I K+ + + IK +SR + 
Sbjct: 274 WDEDYPLHIQHIRCEFELKEGYIPTIQIK-RSRFYKGNEYLKSSGGEIADLW 324 

Query: 426 QDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIKTQGK^ 485 

++ +D + + + +Y EY F T G K+ I+ + I 

Sbjct: 325 --VSNVD-LELMKEHYDLYNVEYISGLK FKATTGLFKDFIDKWTHIKTTSEGAI 375 

Query: 486 NEHPYSNEEVMLSKVVLNGLYG IPALRSHFNL-FRLDDNNELYNIINGY 533 

+ L+K++LN LYG +P L+ + L FRL G 
Sbjct: 376 KQ LAKLMLNSLYGKFASNPDVTGKVPYLKENGALGFRL GE 415 

Query: 534 KNTERNIL--FSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMKSVVKPLLNPS 591 

^ ^ + T + + F + T + + Y + Q D IYCDTDS+** P + 

Sbjct: 416 EETKDPVYTPMGVFITAWARYTTITAAQACF DRIIYCDTDSIHLTGTEIPDVIKD 470 

Query: 592 LFDPIALGKWDIENEQIDKMFVLNHKKYAY EVNGKI 627 

DP LG W E+ + LKY EV+GK+ 
Sbjct: 471 IVDPKKLGYWAHES-TFKRAKYLRQKTYIQDIYMKEVDGKL 510 

>gi|2435429 (AF012250) unassigned reading frame (possible DNA 
polymerase) [Physarum polycephalum] 
Length = 544 

Score = 44.9 bits (104), Expect = 0.002 

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%) 

Query: 179 TSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDNDMNDSEAYDYAVKCFAKLTPEQLTYI 238 

T+LKLD+TQ FNM Y+CFLP++I 
Sbjct: 62 TQLFNLLKSLQDSSFYTFKQ FTYQNIM YSLEISCF--LYPKKKILI 105 

Query: 239 HNDVIILGMCHIHYSDIFPNFD YNKL--TFSLNIMESY-LNNEMTRFQLLNQYQD 290 

D+ +I Y+D+ ++ YN++ +++NI Y L+ ++ + 

Sbjct: 106 -KDLYNFFSENIIYNDWKDYKLLAILYNEIQTAYNININRKYILSTASLSLRIFKKSFP 164 
Query: 291 

K + D + +YI + YGGN I + + + + D+NS YPY+M EK 

Sbjct: 165 EKYRLIPHLTRDED--NYIRKSYIGGRNEIFEHVAQRNYFYDVNSLYPYIMKKEK 217 

Query: 351 IPTWLYFYEHYSEPTLIPTFLDD-DNYFSLYKIDKDVFNDDLL---IKIKSRVLRQ 402 

+P + Y + + F + +N+F L I+K N +L + IK+ V 

Sbjct: 218 

Query: 403 

L + Q I + IY + ++++F+ Y + 

Sbjct: 274 IYAKGTLRGIYFSEEIKLALKQGYKIIE---IYSAYEYKEKEWFEEYVEQ 323 

Query: 463 TQGK-LKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYG IPALRS 513 

+ LK K D+D L K +LN LYG I+ 

Sbjct: 324 

Query: 514 

+ DN + + ++ N++ + ++ + FYT + + IY 
Sbjct: 364 -ITDNTYISHDTTEFIDITANTCYNNIAITSAITSYARIFKYNTILNYNLHVIYI 421 

Query: 574 

DTD L++K+ P+ + +L +GK+ +E+ + F+ N K Y Y +N I 

Sbjct: 422 DTDGLFLKN PIPDIALTTSKEMGKFRLESINAEAHFIAN-KFYIYAPINSPIIYKFK 477 

Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAIIENNKSIYNEQGT ISIYPSK 678 

GIP N D + + +F +I NN Y+ Q + I Y + 

Sbjct: 478 

Query: 679 

I+C 

Sbjct: 536 



>gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum 
polycephalum) >gi|509721|dbj|BAA06121.1| (D29637) DNA 
polymerase [Physarum polycephalum] 
Length = 547 

Score = 44.9 bits (104), Expect = 0.002 

Identities = 118/545 (21%), Positives = 206/545 (37%), Gaps = 104/545 (19%) 

Query: 179 TSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDNDMNDSEAYDYAVKCFAKLTPEQLTYI 238 

T+LKLD+TQ F NM Y + CF L P++ I 

Sbjct: 65 TQLFNLLKSLQDSSFYTFKQ FTYQNIM YSLEISCF--LYPKKKILI 108 

Query: 239 HNDVI I LGMCH I HYSD I F PNFD YNKL- -TFSLNIMESY- LNNEMTRFQLLNQYQD 290 

D+ +1 Y+D+ ++ YN++ +++NI Y L+ ++ + 

Sbjct: 109 -KDLYNFFSENIIYNDWKDYKLLAILYNEIQTAYNININRKYILSTASLSLRIFKKSFP 167 

Query: 291 I KI SYTHYHFHDMNFYDYI KS FYRGGLNMYNTKYINKLI DEPCFS ID INS S YPYVMYHEK 350 

K + D + +YI + YGGN I + + + + D+NS YPY+M EK 

Sbjct: 168 EKYRLI PHLTRDED- -NYIRKSYIGGRNE I FEHVAQRNYFYDVNSLY P YI MKKEK 220 

Query: 351 I PTWLYFYEHYSEPTLI PTFLDD - DNYFS LYKIDKDVFNDDLL IKIKSRVLRQ 402 

+ p + Y + + F + +N+F L I+K N +L + IK+ V 

Sbjct: 221 MPIGI pEYRDKEYMKKFEKNIENFFGFIDVLITIEKTNNNIPVIJ'YRMGIKNNV-EV 276 

Query: 403 MXXXXXXXXXXXXXXXXXXLRMIQDITGIDCMHIRVNSFVIYECEYFHARDIIFQNYFIK 462 

L + Q 1 + IY + ++++F+ Y + 

Sbjct: 277 GIIYAKGTLRGIYFSEEIKLALKQGYKI IE IYSAYEYKEKEWFEEYVEQ 326 

Query: 463 TQGK-LKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYG IPALRS 513 

+ LK K D+D L K +LN LYG I + 

Sbjct: 327 MYNRRLKAK DPALKD LYKKLLNTLYGRFGLVYEQ I D 1 1 S P 366 

Query: 514 HFNLFRLDDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYC 573 

L+DN++ + + N++ ++++ FYT + + IY 
Sbjct: 367 EKEL--ITDNTYISHDTTEFIDITANTCYNNIAITSAITSYARIFMYNTILNYNLHVIYI 424 

Query: 574 DTDSLYMKSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVLN^ 632 

DTD L++K+ P+ + +L +GK+ +E+ + F+ N K Y Y +N I 

Sbjct: 425 DTDGLFLKN piPDIALTTSKEMGKFRLESINAEAHFIAN-KFYIYAPINSPIIYKFK 480 

Query: 633 GIPK NAFDTSVDFETFVR EQFFDGAI I ENNKS I YNEQGT ISIYPSK 678 
GIP N D + + +F +I NN Y+ Q + I Y + 

Sbjct: 481 GIPI^KPIFNIHDIITQHKKIItflTLGHHYFTFSIRLN^ 540 

Query: 679 TEIVC 683 
Sbjct: 541 PWIIC 545 

>gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora 
crassa] 
Length = 1035 

Score = 44.1 bits (102), Expect = 0.004 

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%) 

Query: 521 DDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYM 580 

+ N EL + ++G K+ I ++ + + ++ +♦ +++♦ S Y DTDS+-M- 

Sbjct: 817 EKNYELLS YLDGEKDDGFI INSTS I AAATASWSRILMYKHI I NSA YTDTDSIFV 870 

Query: 581 KSVVXPLLNPSLFDPIALGKWDIENEQIDKMFVL^ 640 

+ KPL + ++ K+ +1+ ++KY + GK++I GI KN + 

Sbjct: 871 E KPLDSAFIGEGCGKFKAEYNGQLIKRAIFISGKLYLLDFGGKLEIKCKGITKNKDN 927 

Query: 641 TSVDFETFVREQFFDG AI IENNKS I YNEQGTIS I YPS KTEIVCGNVYDB 689 

T+ + + E ++G + + E GT+++ K ++ G YD+ 

Sbjct: 928 TTHNUDINDFEALYNGESRVLFQERWGRSLEIX3TVTVKYQKYNLISG— YDK 977 



>gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE 
>gi|283351|pir||S26985 probable DNA-directed DNA 
polymerase (EC 2.7.7.7) - Neurospora crassa 
mitochondrion plasmid maranhar (SGC3) 
>gi|578156|emb|CAA39046| (X55361) putative DNA 
polymerase [Neurospora crassa] 
Length = 1021 

Score = 44.1 bits (102), Expect = 0.004 

Identities = 36/172 (20%), Positives = 82/172 (46%), Gaps = 14/172 (8%) 

Query: 521 DDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYM 580 

+ N EL + +-fG K+ I ++ + + ++ ++ ■♦■+++ S Y DTDS+++ 

Sbjct: 815 EKNYELLSYLDGEKDDGFI INSTS IAAATASWSRILMYKHI INSA YTDTDSIFV 868 

Query: 581 KSVVTCPLLNPSLFDPIALGKWDIENEQIDKMFVLNH^ 640 

+ KPL + +♦ K+ +1+ ++KY + GK++I GI KN + 

Sbjct: 869 E KPLDSAFIGEGOGKFKAEYNGQLIKRAIFISGKLYLLDFGGKLEIKCKGTTKNKDN 925 

Query: 641 TSVDFETFVREQFFDG AIIENNKSIYNEQGTISIYPSKTEIVCGNVYDE 689 

T+ + + E ++G + + E GT+++ K ++ G YD+ 

Sbjct: 926 TTHNIiDINDFEALYNGESRVI*FQERWGRSLEI/»TVTVlCYQKYNLISG- - YDK 975 



>gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 
(PHOSPHOFRUCTOKINASE 2 II) (6PF-2-K 2) 
>gi|2131162|pir||S61066 6-phosphofructo-2-kinase (EC 
2.7.1.105) - yeast (Saccharomyces cerevisiae) 
>gi|2131163|pir||S71026 6-phosphofructo-2-kinase (EC 
2.7.1.105) - yeast (Saccharomyces cerevisiae) 
>gi|1085116|emb|CAA62371| (X90861) 
6-phosphofructo-2-kinase [Saccharomyces cerevisiae] 
>gi|1420028|emb|CAA99157| (Z74878) ORF YOL136C 
[Saccharomyces cerevisiae] >gi|1628439|emb|CAA64733| 
(X95465) 6-phosphofructo-2-kinase [Saccharomyces 
cerevisiae] 
Length = 397 

Score = 40.6 bits (93), Expect = 0.041 

Identities = 48/208 (23%), Positives = 92/208 (44%), Gaps = 29/208 (13%) 

Query: 175 MKTNTSIATIXaKKLLDGG YLTESQIJCTO EAYDYAVKCFAKLTPEQ 234 

++ S AT+ K LL L+ + + FN K+ND ++ +A++T ++ 

Sbjct: 139 IRRQISCATISKPLL LSNTSSEDLFN PKNNDKKET YARITLQK 181 

Query: 235 LTY-IHNDVIILGMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLN QYQD 290 

L + I+ND +G+ SI ♦ F ♦ S+ +E++ F L+ Q 

Sbjct: 182 LFHEINNDECDVGIFDATNSTI ERRRF I FEEVCS FNTDELS S FNLVP 1 1 LQVS C 235 
Query: 291 IKISYTHYHFHDMNFY-DYIKSFYRGGLNMYNTKYINKLIDEPCFSID-INSSYPYVMYH 348 

S+ Y+ H+ +F DY+ Y «■ + + + FS+D N + Y+ H 

Sbjct: 236 FNRSF I KYN I HNKS FNED YLDKP YELAI KDFAKRLKHYYSQFT PFS LDE FNQI HRYI SQH 295 

Query: 349 EKIPTWLYFYEHYSEPTLIPTFLDDDNY 376 

E+I T L+F+ ♦ + P L+ +Y 
Sbjct: 296 EEIDTSLFFFNVINAGWEPHSLNQSHY 323 



>gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation 
factor sigma [Reclinomonas americana] 
Length = 532 

Score = 39.9 bits (91), Expect = 0.070 

Identities = 49/205 (23%), Positives = 84/205 (40%), Gaps = 14/205 (6%) 

Query: 100 NHFLIjKDTMR YFDNITRENI YIJCSAEENEHTLKMKEAT I LAKNQNV I L E KRVKSS IN 156 

N+ ♦ ♦ F + ++IY+ ♦ +KE L K NVI + K +K N 

Sbjct: 177 NYLVKNS YLNL FKTVPHDS I YMNYS Y I QTPLN I LKE YLQLI KI INVI I LQ I NKNI KKKNN 236 

Query: 157 LDLTMFLNG FKFNI I DNFM - - - KTNTS IATLGKKLLDGG YLTESQLKTD FNYTI FDKDND 213 

L++++FL F ♦ N++ K + ♦ + K L Y+T L T Y K 
Sbjct: 237 LNISIiFLYKFTQELKWNYIFINKISRNTQKINIKTLKNSYITFYN^ 296 

Query: 214 MNDSEAYDYAVKCFAK- - LTPEQLTYIHNDVI ILGMCHIHYSDIFPNFDYN- KLTFSLNI 270 

D +K F K P+ +N +1 G+ HI* + N K+T I 

Sbjct: 297 KKDIFYKQIFIKTFIiKQHKIPKINKIKNNSLIKYGLTHITO^ 356 

Query: 271 MESYLNNEMTRFQLLNQYQDIKISY 295 

♦ +Y+ T + QY +KI Y 
Sbjct: 357 IFNYMPYITT ISKQY- -VKIGY 376 



>gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) 
[Bacteriophage phi-29] 
Length = 575 

Score = 39.5 bits (90), Expect = 0.092 

Identities = 41/150 (27%), Positives = 64/150 (42%), Gaps = 36/150 (24%) 

Query: 497 LSKVVLNGLYG - IPALRSHFNL-FRLDDNNELYNIINGYKNTERNIL--F 542 

L+K++LN LYG +P L+ + L FRL G + T+ + 

Sbjct: 381 LAKLMLNSLYGKFASNPDVTGKVPYLKENGALGFRL GEEETKDPVYTPM 429 

Query: 543 STFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMKSVVKPLLNPSLFDPIALGKWD 602 

F+T+ + Y + Q D IYCDTDS+++ P + + DP LG W 

Sbjct: 430 GVFITAWARYTTITAAQACY DRIIYCDTDSIHLTGTEIPDVIKDIYDPKKLGYWA 484 

Query: 603 IENEQIDKMFVLNHKKYAY EVNGKI 627 

E+ ++ L K Y EV+GK+ 
Sbjct: 485 HES - TFKRVKYLRQKTYI QDI YMKEVDGKL 513 
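
The query headers in this table give each ORF's nucleotide coordinates together with the length of the translated protein in letters. The sketch below checks the two against each other, under the assumption (not stated in the text, but consistent with the figures shown) that the coordinates are inclusive and that the span includes the stop codon. The function name `orf_protein_length` and the regular expression are illustrative only.

```python
import re

# Captures the two coordinates in a header fragment such as "ORF|10342-12627|".
ORF_COORDS = re.compile(r"ORF\s*\|\s*(\d+)\s*-\s*(\d+)")


def orf_protein_length(query_line: str) -> int:
    match = ORF_COORDS.search(query_line)
    if match is None:
        raise ValueError("no ORF coordinates found")
    start, end = (int(value) for value in match.groups())
    nucleotides = end - start + 1  # assumed inclusive coordinates
    if nucleotides % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    # Assumed: the final codon is a stop codon and is not translated.
    return nucleotides // 3 - 1


line = "Query= pt|110871 44AHJDORF001 Phage 44AHJD ORF|10342-12627|-1 1"
print(orf_protein_length(line))
```

For 44AHJDORF001 the span 10342-12627 covers 2286 nucleotides, or 762 codons, which under these assumptions corresponds to the 761 letters reported for that query.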



Query= pt|110872 44AHJDORF002 Phage 44AHJD ORF|3789-5732|3 1 
(647 letters) 

>gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTEIN C 
>gi|478126|pir||D49757 teichoic acid biosynthesis protein 
tagC - Bacillus subtilis (strain 168) >gi|143727 
(M57497) putative [Bacillus subtilis] 
>gi|2636103|emb|CAB15594.1| (Z99122) alternate gene 
name: dinC [Bacillus subtilis] 
Length = 442 

Score = 112 bits (278), Expect = 7e-24 

Identities = 91/314 (28%), Positives = 147/314 (45%), Gaps = 58/314 (18%) 

Query: 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKPEGFWINKLTPSG 207 

F* + PK V QS N D++ + +Y+TQ S + + I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVADKTVLQSFNFDEKNHQIYTrQV7^^ 66 

Query: 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAKLLQVAYKDNYVLDLEEA 262 

+ SM + GGHGT IG+E + NG + IW +D ++L+ YK LD E + 

Sbjct: 67 LQLDSMLIiKHGGHGTNlGIENR-NGTIYIWSLYDKPNETDKSELVCFPYKAGATLD-ENS 124 
Query: 263 KGLTDYTPQSLLNKHTFTPLIDEANDKLILRFGDGTIQVRSRADVKNHIDNVEKEKTIDN 322 

K L ++ H TP +D N +L +R + D KN+ N ++ +TI N 

Sbjct: 125 KELQRFSNMPF - - DHRVTPALDMKNRQLA I R QYDTKNN- -NNKQWVTIFN 170 

Query: 323 SE NNDN RWMQG I AVDGDDLYWLSGNSS VNSHVQIGKYS LTTGQKI 367 

+ N +N ++QG +D LYW +G+++ S+ + + 
Sbjct: 171 LDDAIANKNNPLYTINI PDELHYLQGFFLDDGYLYWYTGDTNSKSYPNL- - I TV 222 

Query: 368 YDYPFKLSYQDGINFPRD NFKEPEGICIYTNPKTKRKSLLLAMTNGGGGKRFH 420 

+D K+ Q I +D NF+EPEGIC+YTNP+T KSL++ +T+G G R 

Sbjct: 223 FDSDNXIVLQKEITVGKDLSTRYENNFREPEGICMY^ 282 

Query: 421 NLYGFFQLGEYEHF 434 

+Y + YE+F 
Sbjct: 283 RIYAYH SYENF 293 



>gi|142847 (M64050) DNase inhibitor [Bacillus subtilis] 
Length = 125 

Score = 51.9 bits (122), Expect = 1e-05 

Identities = 35/116 (30%), Positives = 55/116 (47%), Gaps = 10/116 (8%) 

Query: 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETNHMYSTQSDS QKPEGFWINKLTPSG 207 

F+ + PK V QS N D++ + +Y+TQ S + + I +L+ G 

Sbjct: 7 FDFTNITPKLFTELRVADKTVIiQSFWroEKNHQ 66 

Query: 208 DLISSMRIVQGGHGTTIGLERQSNGEMKIWLHHD GVAKLLQVAYKDNYVLD 258 

+ SM + GGHGT IG+E + NG + IW +D ++L+ YK LD 

Sbjct: 67 LQLDSMLLKHGGHGTNIGMENR-NGTIYIWSLYDKPNETDKSELVCFPYKAGATLD 121 



>gi|4038407 (AF103943) factor C protein precursor [Streptomyces 
griseus] 
Length = 324 

Score = 39.1 bits (89), Expect = 0.10 

Identities = 61/269 (22%), Positives = 102/269 (37%), Gaps = 33/269 (12%) 

Query: 172 VNQSINIDKETNHMYSTQSDSQKPEG FWINKLTPSGDLISSMRIVQGGHGTTIGLER 228 

V QS D ++ Q S P+ I +L SG+ + M ++ GHG +IG + 

Sbjct: 66 VQQSFTFDIVNRRLFVAQLKSGSPDDSGDLCITQIJ)FSGNKLGHMYL^ 124 

Query: 229 QSNG EMK I WLHHDG VAKLLQ VA YKDNYVLD LE EAKG LTD YT PQS LLNKHTFT P 281 

+ +WD + + + + GT S L KH P 

Sbjct: 125 PVGADTYLWTEVD VNSNARGTRIiARFKWNNGATLSRTSSALAKHQPVPGATEMTC 179 



Query: 282 L I D EAND KLI LR FGDGT I QVRS RADVKNH I DNVEKEMTIDNS ENNDNRWMQG I AVDGDDL 341 

ID N+++ +R+ ++ +V+ V+D QGA+G + 

Sbjct: 180 AID P VNNRMAI RYLTASGRRYG I YNVAD I AAG VYDKPLSDVPH PTGLGTFQG YALYGS YV 239 

Query: 342 YWLSGN S S VNSHVQ I GKYS LTTGQKI YD YPFKLSYQDG INF PRDN FKEPEG I C 394 

Y L+GN + NS+V + TG + + + G F+EPEG+ 
Sbjct: 240 YQLTGNPYGPDNPNPGNSYVS- -SVDVNTGALVQ RAFTRAGSTL TFREPEGMG 290 

Query: 395 IYTNPKTKRKSLLLAKTNGGGGKRFHNLY 423 

IY + + L L +G G R NL+ 

Sbjct: 291 I YRTAAGEVR - LFLGFASG VAGDRRSNLF 318 
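
The Score (in bits) and Expect values reported for each hit are linked by the standard relation for normalized bit scores, E = N x 2^(-S), where S is the bit score and N is the effective search space (query length times database length). The following sketch uses that relation to back out an approximate N from one hit and to predict the E-value of another hit for the same query. It is only a rough consistency check: the function names are hypothetical, and because the printed scores and E-values are rounded, agreement is expected only to within an order of magnitude.

```python
def search_space(bits: float, evalue: float) -> float:
    # Invert E = N * 2**(-S) to estimate the effective search space N.
    return evalue * 2.0 ** bits


def expected_evalue(bits: float, space: float) -> float:
    # E = N * 2**(-S) for a bit score S and effective search space N.
    return space * 2.0 ** (-bits)


# ORF002 against TagC: Score = 112 bits, Expect = 7e-24.
space = search_space(112.0, 7e-24)

# Predicted E-value for the 51.9-bit DNase inhibitor hit (reported: 1e-05).
predicted = expected_evalue(51.9, space)

print(f"estimated search space: {space:.2e}")
print(f"predicted E-value at 51.9 bits: {predicted:.1e}")
```

Because both hits come from the same query searched against the same database, the search space estimated from one hit should roughly reproduce the E-value reported for the other.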



Query= pt|110873 44AHJDORF003 Phage 44AHJD ORF|6626-8389|2 1 
(587 letters) 

>gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) 
>gi|75850|pir||WMBPT9 gene 9 protein - phage phi-29 
>gi|215327 (M14782) tail protein [Bacteriophage phi-29] 
>gi|225364|prf||1301270D gene 9 [Bacillus sp.] 
Length = 599 

Score = 92.4 bits (226), Expect = 8e-18 

Identities = 126/618 (20%), Positives = 251/618 (40%), Gaps = 71/618 (11%) 



Query: 5 TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPY-NFIRDRMEINVD 62 

TN + + PF+ DY+NT F S+ + ++F R + + SK + F ++ ++V 
Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNWF- - NRKSRVYEMSKVTFMGFRENKPYVSVS 66 
Query: 63 MQWHDAQGINYMTFLS-DFEDRRYYAFVNQIEYVNDVVVKIYFVIDTIMTYTQGNVLEQL 121 

♦ +Y+ F + D+ +♦ +YAFV ++E+ N V ++F ID + T+ ++ 

Sbjct: 67 LP I DKL YS AS Y IMFQNADYGNKWFYAFVTELE FKK S AVTYVHFEID VLQTWMFDMKFQES 126 

Query: 122 SNVNI ERQHLS KRTYNYMLPPttJWOTD^KVSNKNYVYNQMQQYI*ENL VLFQS S ADL5 KK 181 

I R+H+ K + P+D+L+++ + ♦ ♦♦P S 

Sbjct: 127 F IVREHV-KLWNDKrrPTINTIDEGLSYGSEYDIVSVKNHKPYDDMMFLVIISKSIM 182 

Query: 182 FGT-- KKEPNLDTSKGTIYDNITSPVNLYVMEYGDFINFMDKMSAYPWITQNFQK V 235 

GT ++E L+ + + P+ Y+ + + D +1 N V 

Sbjct: 183 HGTPGEEESRLNDINASL- NGMPQPLCYYIHPF YKDGKVPKTYI GDNNANLS P I V 236 

Query: 236 QMLPKDFINTKDLEDVKTSEKITGLKTLKQGGKSKEWSIJC-DIiSL-- --SFSNLQ 285 

ML F ♦ D+ «■ +T LK K+ + LK D + N+ 

Sbjct: 237 NMLTNIFSQKSAVNDI -VNMYVTDYIGLKLDYKNGDKELKItDKDMFEQAGIADDKKGNVD 295 

Query: 286 EMMLSK KDEFKHMI RNE YMTI E FYDWNGNTMLLDAGKI S QK 326 

♦ + K KD+++YED+GNML 1+ 

Sbjct: 296 T I FVKKI PDYEALE IDTGDKWGGFTKDQESKLMMYP YCVTE ITDFKGNHMNLKTEYINNS 355 

Query: 327 TGVKLRTKS 1 1 GYHNEVRVYPVD YNS AEND RP I LAKNKE I L I DTGS FLNTN I TFNS F AQV 386 

+K++ ■»• +G N+V DYN+ D + N+ S +N N 
Sbjct: 356 K - LKI QVRGS LGVSNKVAYS VQD YNA DSALSGGNRLT ASLDSS LIKNNPN 404 

Query: 387 P I LINNG I LGQ SQQANRQ - - KNAES QL I TNRI DNVLNG SD PKSRFYDAVS VASNLS P 441 

I I N L Q N+ +N +S ++ N I ++ G + + A+ +AS++ 

Query: 442 TALFGKFNEEYNFYKQQQAEYKDLALQPPSVTESEMGNAFQIANSINGLTMKISVPSPKE 501 

T + + QA+ D+A PP +T+ AF N G+ + + 

Sbjct: 463 TGMTSTAGNAVLQMQAMQAKQAD I ANI PPQLTKMGGNTAFD YGNG YRG VYVI KKQLKAEY 522 

Query: 502 ITFLQKYYMLFGFEVNDYNSFIEPINSMTVCNYIiKCTGTCT^ 561 

L ++ +G+++N + + NY++ + DI+ +++++ I ++G 

Sbjct: 523 RRSLSSFFHKYGYKINRVKK- - PNLRTRXAFNYVQTKDCFISGDINNNDLQEIRTIFDNG 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579 

+ WH D GN ++N L 
Sbjct: 581 ITLWHTDNIGNYSVENEL 598 



>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) 
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA 
>gi|216058 (M11813) tail protein [Bacteriophage PZA] 
Length = 599 

Score = 81.9 bits (199), Expect = 1e-14 

Identities = 127/618 (20%), Positives = 248/618 (39%), Gaps = 71/618 (11%) 

Query: 5 TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRME-INVD 62 

TN + + PF+ DY+NT F S+ + ++F + + SK + R+ I+V 

Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNWF--NSKTRVYEMSKVTFQGFRENKSYISVS 66 

Query: 63 MQWHDAQGINYMTFLS-DFEDRRYYAFVNQIEYVNDVVVKIYFVIDTIMTYTQGNVLEQL 121 

++ + Y+ F + D+ ++ +YAFV ++EY N ++F ID + T+ N+ Q 

Sbjct: 67 LRLDLLYNASYIMFQNADYGNKWFYAFVTELEYKNVGTTYVHFEIDVLQTW-MFNIKFQE 125 

Query: 122 

I R+H+ K P + D+ L ++ + + + Y + + L 

Sbjct: 126 

Query: 180 -GD- -FINFMDK 221 

+ E L+ ++ + + P+ Y+ GD +N + 

Sbjct: 183 

Query: 222 QNFQKVQMLPKDFINTK DLEDVKTSEKITGLKTLKQGGKSKEWS 273 

N V M D+I K +L+ K + G+ KG + 

Sbjct: 242 UNI - - VNMYVTDYIGLKLDYKNGDKELKLDKDMFEQAGIADDKHGNVDTIFV 299 

Query: 274 

+L KD+ E D+ GN M L +K 

Sbjct: 300 

Query: 331 

++ + +G N+V DYN+ L+ L+T+* N+ + I+ 
Sbjct: 359 IQVRGSLGVSNKVAYSIQDYNAGGS LSGGDRLTAS ----- LDTSLINNNPNDIAI I - 409 

Query: 391 NNGILGQSQQANRQ- -KNAESQLITNRIDNVLNGSDPKSRFYDAVSVASNLSP 441 

N L Q N+ +N +S ++ N I +L G A ■*■ A SP 

Sbjct: 410 - NDYLS A YLQGNKNS LENQKSS I LFNG I VGMLGGG VSAGASAVGRS P FGLAS S V 462 

Query: 442 TALFGKFNEEYNFYKQQQAEYKDLALQPPSVTESEMGNAFQIANSINGLTMKISVPSPKE 501 

T + + QA+ D+A PP +T+ AF N G+ + + 

Sbjct: 463 TGMTSTAGNAVLDMQALQAKQADIANI PPQLTKMGGKTAFDYGNGYRGVYVI KKQLKAEY 522 

Query: 502 ITFI^KYYMLFGFEVNDYNSFIEPINSMTVOmiKCTGTYTIRDIDPm^QLKAILESG 561 

L ++ ♦ + NY++ + DI+ I ++G 

Sbjct: 523 RRSLSSFFHKYGYKINRVKK--PNLRTRKAYNYIQTKDCFISGDINNNDLQEIRTIFDNG 580 

Query: 562 VRFWHNDGSGNPMLQNPL 579 

+ WH D GN ++N L 
Sbjct: 581 ITLWHTDDIGNYSVENEL 598 



>gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B103] 
Length = 598 

Score = 77.6 bits (188), Expect = 2e-13 

Identities = 130/623 (20%), Positives = 240/623 (37%), Gaps = 86/623 (13%) 

Query: 5 TNFKFFYNTPFT-DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRMEIN 60 

T+ + FN PP+ DY++T F + + YF + K + NP+ I 

Sbjct: 9 TDVRI FSNVPFSNDYKSTRWFTNADAQYS YF NAKPRVHVINECNFVGLKEGTPHIR 64 

Query: 61 VDMQWHDAQGINYMTFLS-DFEDRRYYAFVNQIEYVNDVVVKIYFVIDTIMTYTQGNVLE 119 

V+ ♦ D YM F + + +Y FV ++EYVN V +YF ID I T+ + 

Sbjct: 65 VNKRIDDLYNACYMI FRK^QYS^^CWFYCFVTRLEYV1ISGVTNLYFEIDVIQTW- MFDFKP 123 

Query: 120 QI^NVNIERQHLSKRTYNYFUjPMLRNNDDVLKVSNKOT 179 

Q S + E Q + P+ D+L + V Q ++F S 

Sbjct: 124 QPSYI VREHQEMWDANNE PLTNTIDEGLNYGTEYDWAVEQYKPYGDLMFMVCISKS 180 



Query: 180 KKFGTKKEPNLDTSKGTIYDNITSPVNLYVMEYGDFINFMDKMSAYPWITQNFQKVQ 236 

K T E G I NI P++ YV + + D S P +T +VQ 

Sbjct: 181 KMHATAGET---FKAGEIAANINGAPQPLSYYVHPF YEDGSS--PKVTIGSNEVQ 230 

Query: 237 ML-PKDFINTKDLEDVKTSEKITGLKT--LKQGGKSKEWSLKDLSLSFSNL- 284 

P DF+ ++ + ++ T +K SL+D 

Sbjct: 231 VSKPTDFLKNMFTQEHAVNNIVSLYVTDYIGLNI 290 

Query: 285 - QEMMLSKKDEFKHMIRNEYMTIEFY DWNGNTMLLD AG K 322 

+E + +F NE + Y D+- GN + + 

Sbjct: 291 PNVNTIYLKEVKEYEEKTIDTGYKFASFANNEQSKLIiMYPYCVTTITDFKGNQID 350 

Query: 323 I SQKTGVKLRTKSI IGYHNEVRVYPVDYNS AENDRPILAKNKEILIDTGSFLNTNIT 379 

++ + +K++ ♦ +G N*V DYN+ D+ + A NT++ 
Sbjct: 351 VNG - SNLKI QVRGSLGVSNKVTYS VQDYNADTTLSGDQMLTAS CNTSLI 398 

Query: 380 FNSFAQVPILINNGILGQSQQANRQ- -KNAESQLITNRIDNVLN GSDPKSRFYDAVS 434 

N+ V 1+ N L Q N+ +N + ++ N + ++L G+ + AV 
Sbjct: 399 NNNPNDVAII - -NDYLSAYU^KWSLENQKDSILFNGVMSMLGNGIGAVGSAATGSAVG 456 



Query: 



435 VASNLSPTALFGKFNEEYNFYKQQQAEYKDLALQPPSVTESEMGNAFQIANSINGLTMKI 494 
VAS S T + + QA+ D+A PP + + A+ N G+ + 

Sbjct: 457 VAS - - S ATGMVS S AGNAVLQI QGMQAKQ ADIANT P PQLVKMGGNTAYD YGNG YRGVYV I K 514 

Query: 495 SVPSPKEITFl^KYY^FGFEVNDYNSFIEPINSMTVCNYLKCTGTYTI 554 

+ h ♦ +G++ N + + ♦ NY++ I +++ ++++ 

Sbjct: 515 KQ I KEEYRNI LSD FSRKYG YKTNLVK - - MPNLRTRES YNYVQTKDCNI I GNLNNEDLQKI 572 

Query: 555 KAILESGVRFWHNDGSGNPMLQN 577 

+ I +SG+ WH D G+ L N 
Sbjct: 573 RTIFDSGITLWHADPVGDYTLNN 595 



>gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] 
>gi|224163|prf||1011232C protein p9.tail [Bacteriophage 
phi-29] 
Length = 335 

Score = 71.0 bits (171), Expect = 2e-11 

Identities = 64/293 (21%), Positives = 123/293 (41%), Gaps = 20/293 (6%) 

Query: 292 KDEFKHMI RNEYMTI EFYDWNGNTMLLDAGK I SQKTGVXLRTKS I IG YHNEVRVYPVDYN 351 

KD+ ++ Y E D+ GN M L 1+ +K++ + +G N+V DYN 
Sbjct: 57 KDQES KLMMYP YCVTE ITDF KGNHMNLKTEY INNS K - LKI QVRGS LGVSNKVAYS VQDYN 115 

Query: 352 SAENDRPILAKNKEILIDTGSFLNTNITn^SFAQVPILINNGILGQSQQANRQ--KNAES 409 

+ D + N+ S +N N I I N L Q N+ +N +S 

Sbjct: 116 A DSALSGGNRLTASLDSSLINNNPN DIAILNDYLSAYLQGNKNSLENQKS 165 

Query: 410 QL I TNR I DNVLNG SD PKSR FYDA VSVASNLS PT AL FG KFNE EYN FYKQQQ AEYKDLA 466 

++ N X ++ G + + A-*- +AS++ T + + QA+ D+A 

Sbjct: 166 SILFNGI MGMI GGG I SAGASAAGGS ALGMASS V - - TGMTSTAGNAVLQMQ AMQAKQAD I A 223 

Query: 467 LQP PSVTES EMGNAFQ I ANS INGLTMKX SVPS PKE ITFLQKYYMLFG FEVNDYNSFI E PI 526 

PP +T+ AF N G+ + + L ♦+ +G+++N ♦ 

Sbjct: 224 NIP PQLTKMGGNT AFD YGNGYRGVYV I KKQLKAE YRRS LS S F FHKYG YKINRVKK - - PNL 281 

Query: 527 NSMTVCNYLKCTGTYT IRDIDPMLMEQLKAI LESGVRFWHNDGSGNPMLQNPL 579 

+ NY++ + DI+ ♦++++ I ++G+ WH D GN ++N L 
Sbjct: 282 RTRKAFNYVQTKDCF I SGDINNNDLQ £ I RTI FDNG ITLWHTDNI GNYS VENEL 334 



>gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage 
CP-1] 

Length = 230 
Score = 53.9 bits (127), Expect = 3e-06 

Identities = 29/113 (25%), Positives = 54/113 (47%), Gaps = 3/113 (2%) 



Query: 1 MRKLTNFKFFYNTPF-TDYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRMEI 59 

M++ T + +PF DY N I+F + + +D+F + Y + + + I 

Sbjct: 1 MQESTKI WLYAKS PFKNDYANVTNFETRESMEDFFTKKNPHI EI VYEYDKFQYTQRNGSI 60 

Query: 60 NVDMQWHDAC^ INYMTFLSDFEDRRYYAFWQI EYV1IDVVVKI YE^ DTI 112 

V + + + YM F+++ R YYAFV + Y+N+ +1 ♦ +D TY 
Sbjct: 61 WSGRVTSKYENVTYMRFINN- -GRTYYAFVFTJVLYINE^ 111 



>gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage 
CP-1] 

Length = 586 
Score = 42.2 bits (97), Expect = 0.010 

Identities = 79/381 (20%), Positives = 139/381 (35%), Gaps = 92/381 (24%) 

Query: 277 LSLSFSNLQEMMLSK- - KDEFK HMI RNEYMTI EFYDWNGNTMLLDAG KISQKT 327 

L ♦++ +QE + S KD+ + ++ +E+ IE YD GN+ + I + 

Sbjct: 187 LKIAYIXJIQEGLRSYMGKDDLEIEVQLLNSEFTEIELYDIYGNSYVYQPQYLPRTIDEAH 246 

Query: 328 GVKLRTKS I IGYHNEVRVYPVDYNSAEN DRPIL-- - --- 360 

K+ +G N+V + ++YN+A N D+ IL 

Sbjct: 247 KYKVIVSGSIX3DSNQVHINFLEYNNANNVSYADKNILDSLESGDWAEHNPEHFKYGLNDV 306 

Query: 361 - AKNKE I L IDT - GSFLNTNI TFNS FAQVP I LINNG I LGQSQQ ANRQKNAES QL ITNR IDN 418 

K+ IL D S++ ♦+ Q+ N +L QS + ++ A + + 

Sbjct: 307 TGKSVAILNDAEASYIQSHKNQMEHTQLTFKENRDMLKQSVDLSNKQVATANSQASYNAQ 366 

Query: 419 VLNGSD PKSRFYDAVSVASNLS PTALFGKF NE EYN FYKQQQ - - 459 

S +++ + S N++ L G F N +YN QQ 

Sbjct: 367 FAVDSANI N QWT EGASG I LNV AGNLLTGN FGG ALGGLASGGMKVFN ANRD YND KVVQQG F 426 

Query: 460 - AEYKDLALQPPS VTES EMGNAFQI ANS IN 488 

A DL QP SV + AFQ N + 

Sbjct: 427 TS ENN ALKSQ SNALANMKS K I ALDQS I RAYNATMADLQNQPI S VQQIGNDLAFQSGNRLT 486 

Query: 489 GLTMKISVPSPKEITFLQKYYMLFGFEVNDY-NSFIEPINSMTVCNYLKCTGTY 545 

+ K+S+ +■»■ +Y +GVN + N + +S NY+K T+R 
Sbjct: 487 D VYWKVS LAQKE I MGRANEY I KCYG VLVNWFTNDALS VMRS RKRFNYI KM I NVNLGTLR - 545 

Query: 546 I D PMLMEQLKAI LESGVRFWH 566 

+ M ++AI +SGVR W+ 
Sbjct: 546 ANQSHMNA I Q A I FQSG VR I WN 566 



WO 00/32825 PCT/IB99/02040 

Query= pt|110875 44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1 1 
(415 letters) 

>gi|3845203 (AE001399) GAF domain protein (cyclic nt signal 
transduct.) [Plasmodium falciparum] 
Length = 1245 

Score = 52.3 bits (123), Expect = 6e-06 

Identities = 59/246 (23%), Positives = 105/246 (41%), Gaps = 27/246 (10%) 

Query: 174 ESIDRNHGNVDYIGFPKMFLLGNAVOTSSPILSNL 233 

+S D N+ N + + N+V FS+ N IY++L N +YK + E+ 

Sbjct: 854 DSSDNNNNNNNNNNNNNNYNNNNSVIFST NEKIYDML NRDNIYKKVKKEIF 904 

Query: 234 RNDYVNEKRmTtoFNSNDDAMTTGEFEFNEYNI^ 291 

D + + + + M +NN ++N+ N+ N NGD Y KY 

Sbjct; 905 EGDS 1 1 KTMENKPNLTNKNYMNNDN I DNNNNNNNNNN I DNNNNNNGDN I YNDD LKKYYLN 964 

Query: 292 KVMYNVTTFMTMI I WPYTKQYEFCTKIR - DIDNHVTYLRDDMFYKENMERYYYNPSNLH 350 

++N ++ + + + KE K+ 1 + L +F+K NM ♦ + L+ 

Sbjct: 965 TSIFNKDLYVKHFVDIIMNKSLEE1IKMNVYISERINSL LFHKGNM LNDVTKLY 1018 

Query: 351 FDNA YS KNYVVDNDR YL YLDMNK I 1 KFHI KNEMKKNMS E FERKEKI YEDN YIENTK 406 

NAY + N K I F + E K +M F+ +KIY+ N + N K 

Sbjct: 1019 MSNAYGEKCFFFN FPQIKE 1 1 FVNEYEKKMDMKYFKMXJCKI YKYNLNKI FSNNYK 1073 

Query: 407 KYLMKQ 412 
+++K+ 

Sbjct: 1074 FFIIKK 1079 



>gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon; 
MAL3P6.23 (PFC0820w), Hypothetical protein, len: 4982 aa 
[Plasmodium falciparum] 
Length = 4981 

Score = 49.2 bits (115), Expect = 5e-05 

Identities = 67/287 (23%), Positives = 110/287 (37%), Gaps = 60/287 (20%) 

Query: 127 ITDIJlSATDIJCYHSNFIiKHYPIIIYDEFLALEDDYLIDF^KIJCT IYESIDRNHGK 182 

I D+N + D+ + +♦+ I YD +++DK++ IY +ID++ N 

Sbjct: 3619 IMDINKSKDISKNMEIVQN IEYD NKYDKI RNDMDAI YMAI DKDMDN 3664 

Query: 183 VDYIGFPKMFLLGNAVNFSSPILSNLNIYNL LQKHKMNTSRLYKNI FLEMRRND YV 238 

+ 1 +FL NS +N YNL ++KNRYNF +D 
Sbjct: 3665 IG I INCMRYFNLYKNYNNLSNECNNRE - YNLNELYMEDI KRNMKR - YDNNFNINHYDDNN 3722 

Query: 239 NEKRNTRAFNSNDDAMTTGEFEFNEYNIJU3DNIJ*NHINQN^ 298 

N N N+N++ N N ++N N+ N NG F+ D 
Sbjct: 3723 NNNNNNNNNNNNNNNNNNNNNNN^ -- 3771 

Query: 299 TFMTNIIVVPYTKQYEFCTKIRDIDNHVTYIiRDDMFYKENMERYYYNPSNIiHFD 358 

K FCTK ++F +N+E N N N Y+ N 

Sbjct: 3772 - KDLFFCTK KMI FPCKNIETVCKNEYNKKIYNNYTCN 3807 

Query: 359 YWDNDR YLYLDMNKI I KFHI KNEMKKNMS EFERKEK- 1 YEDNYI EN 404 

V+N + ++IK + + N E+ + EK+Y + EM 

Sbjct: 3808 ISWNTLNCIjNIIKELIKIjNIWKKKJIJrre 3854 



Score = 35.6 bits (80), Expect = 0.70

Identities = 62/290 (21%), Positives = 121/290 (41%), Gaps = 65/290 (22%) 

Query: 2 VKQNRLDMVRDYQNAVN- - HVRKKIPDKYNQI ELVDELMNDDIDYYISISNRSDGKSFNY 59 

+K+N ++ +N +N +V++ DK N I D++I + SN + +SF 

Sbjct: 4445 IKRNNINKSNIKRNNINKSNVKRSNTDKSNVIS DFHIT-SNNNITRSFT- 44 92 

Query: 60 VSFFIYIJVIKLDIKFTLLSRHYTLRDAYRDFIEEIIDENPLFKSKRVTFRSARDYLAIIY 119 

A 0 F LS TL +Y +F + + I 
Sbjct: 4493 ATLTDS I FNTLS E - - T LNYS YDNFFSNMDN IKI 4523 

Query: 120 QDKEIGVITDLNSATDLKYHSNFLKHYPIIIYDEFL -ALEDDYLIDEWDKLKTI YE 174 

+ EI ITD++ +YH N+LK ♦ +E++ + +D + DE ++T+ E 

Sbjct: 4524 KKNEINNITDVDYGNKKEYHENYLKVKQNKWEEYIEET 4583 
Query: 175 S - - IDRNHGNVDYIGFPKMFLIX^AVNFSSPILSNLNIYNLLQIGUCMN- -TSRLYKNIFL 230

SIN N+D + + + S P N++ N ++K+ +N R+ KN 

Sbjct: 4584 SCNISENISNID - - MDDEDH I S FPNG RNVHDNNYMKKNHVNYD KMR VGKN K I P 4634 

Query: 231 EMREND YVlTOKRNTRAFNSNDDAMTTGEFEFNEYNIiADDNLRlJH INQNGD 280 

D + +++ + +D M++ E ++ + L + NG+ 

Sbjct: 4635 SFTHFDKILDEKKKK SDKDMSSSKWLEREEHI KEI KLEKNEYMNGN 4680 

Score = 34.0 bits (76), Expect = 2.0

Identities = 47/211 (22%), Positives = 84/211 (39%), Gaps = 32/211 (15%)

Query: 210 I YNLLQKHKMNTSRLYKNI FLEMRRND YVNEKRNTRAFNSNDDAl^^ 269 

I++LLQK LY+N+ + R ♦ N+ T E ++ + ++ 

Sbjct: 918 IFSLLQKDSSPLLVLYENVHI REGEKYGRNE- - ATDNEVDYKKGDI I KH 964 

Query: 270 NLRNHINQNGDFFYI KTD DKYI KVMYNVTTFMTNI I WP YTKQYEFCTKI RDIDNHV 326 

N+ N + D + D+ K MY + V E K D+ N+ 

Sbjct: 965 NVTNEHGNHSDSYPYGNSI2IIjDRKPKNMYE-DIYKEKGFVKSDCSNIEI - -KKNDMINND 1021 



Query: 327 TYIJlDDMFYKENMERYYYNPSNLHFDNAYSKhreVVDNDRYLY I KFHIKNE 382 

Y «►++ FY+++ Y+ + YV++ +YL +N ++ F +KN+ 

Sbjct: 1022 VYKKNE - FYEDSRINMI YDEDEI KTWFLI PHKYVIN - - - 1 IYLFLNI LLTDESNFKLKNK 1077 

Query: 383 MKKNMSEFERKEKIYEDN YIENTKKY 408 

E K IYEDN ++N KKY 

Sbjct: 1078 KYGYFVMEETKGT I YEDNNGLQE I LKNGKKY 1108 

Score = 33.6 bits (75), Expect = 2.7

Identities = 42/198 (21%), Positives = 77/198 (38%), Gaps = 42/198 (21%)

Query: 222 SRLYKNI FTJSMR RNDYVNEKRNTRAF NSNDDAMTTGE FEFNEYNLA 267 

S LY I++ + +N + K+NT + N+++D TT E + + 

Sbjct: 411 SVLYSI I YMNKKYKKKNFI ITNKKNTNVYFENDVI QLS VENTS EDTFTTNTRES SLNSGM 470 

Query: 268 DDNLRNH I NQNGDFFYIKTDDKYI KVMYNVTTFMTNI I WPYTKQYEFCTKIRDIDNHVT 327 

+++R +N D +DDK ++Y N YTK E 
Sbjct: 471 MNDMRYSVNNYADEKVYHSDDKSDHLIYKHVHDEKNKYDEMYTKTKE -- 517 

Query: 328 YLRDDMFYKENMERYYYNP SNLHFDNAYS KNYWDNDRYLYLDMNKI I KFH I KNEMKKNM 387 

+++ YK N+ + N K LD+ K I H+KN+ + N 

Sbjct: 518 - -NENIIYKSNIVDKKTCDISSEMVNGKDK LDVEKYIGSHVKND-ENNK 563 

Query: 388 SEFERK- EKI YEDNYIEN 404 

+ ++K + + + YI+N 
Sbjct: 564 EKLKKKIDNVNKKEYIDN 581 

>gi|3845297 (AE001421) hypothetical protein [Plasmodium falciparum]
Length = 2380

Score = 48.0 bits (112), Expect = 1e-04

Identities = 87/390 (22%), Positives = 160/390 (40%), Gaps = 65/390 (16%) 

Query: 20 VRKKIPDKYNQIELVDELMNDDIDYYISISNRSDGKSFNYVSFF IYLAIKLDIKF 74 

♦++K +K ♦+ + +N D + ++ R K+ NY++ +YL I DI 

Sbjct: 1049 LQRKNMNKCSKNRNRNRYINKDSNIHLMNLIRIKFKNLNYMNMNSFEIELYLKINNDIFL 1108 

Query: 75 TLLSRHYTLRDAYR DFIEEIIDEN-PLFKSKRVTFRSARDYLAIIYQDKEIGVI 127 

+Y +++ Y + + «f EN + +♦+ ++ + Y +K+ 

Sbjct: 1109 QFNKHNYNVQNFYNFSITLINIMSKYYSENFYAYNLEKIVYKFLUJNKNFEYIEKQYSS^ 1168 

Query: 128 TDLNSATDLKYHSNFLKHYPI I IYDEFLA LEDDYLIDEWDKLKTIYESIDRNHGNV 183 

D+N D+ +K+ II EFL L+ D I + KLKT ++ 
Sbjct: 1169 EDMNEL-DILVNTYDMKYDKII EFLXNNGYLKIDRYIYFYPKLKT DI 1214 

Query: 184 D Y I G F P KMFLLGNA VNFS S P I LSNLN I YNLLQ KHKMNTS RL Y KNIF- -LEMRRN 235 

F ++FL N + L NI +♦+ K + Y K IF + M+ + 

Sbjct: 1215 I LFFFKEI FLNDN I LXI DRKFLKK - N ITIMI E VLKEI FFKEYVKRC ITKVI FF PVHMKEH 1273 

Query: 236 DYVNEKR NTRAFNSNDDAMTTGEFEFNEYNLADDNLRNHINQNGDFF^I KTD 287 

D+V K N+ FN* D + N YN D+ N+ N N +Y K 
Sbjct: 1274 DHV^KNYYNNQYVNNSNMFNTRCaDHNNNN^ 1332

Query: 288 DKYIKVMYNVTTFMTNI IV- - - VPYTKQYEFCTKIRDIDNHVTYLRDDMFYKEN ME 340 

+K K+MY ■►+♦ + VK+KI+Y+++ N + 

Sbjct: 1333 NKN-KIMYEKERKSSSLFISNNVQDVKPIIOryLKYSSiyKNFIYIISEIKNFNNKITKIN 1391 

Query: 341 RY- YYNPSNLHFDNAYSKNYWDNDRYLYL 369 

RY YYN NL+ D+ ND YL+L 

Sbjct: 1392 RYNYYNYMNLNIDDL NDAYLFL 1413 



Score = 32.5 bits (72), Expect = 6.0

Identities = 46/183 (25%), Positives = 73/183 (39%), Gaps = 26/183 (14%)

Query: 225 YKNIFLEMRRNDYWEKRNTRAFNSNDDAMTTGEFEF^ 284 

+KNI ++ ++N + NSN ♦ + N N+ +N N IN + I 

Sbjct: 27 H KN I NKN I KNKKFI HIDNSNNCNNS1IS1INSNSNNNNNNNNN I VRNN - NNF IN ADKKKNVI 85 

Query: 285 KTDDKYIJCVMYNVTTFMTNIIVVPYTKQYEFCTKIRDIDNHV^ 344 

+D IK V NI Y ++ + D+ N+ + + KE ER 
Sbjct: 86 LNEDDD I KNKELVDESFVNI FP- - YENYFKNLFNLNDVSNNKVI - - NI I EQKEGDER 138 

Query: 345 NPSNLHFDNAYSKNYWDNDRYLYLDMNKI IKFHI KNEMKKNMSEFT2RKEXI YEDNYIEN 404 

N N N +KN V DN +NK IKN +N++E Y N++ + 

Sbjct: 139 NADN NLKNKNIVRDN INK I KN- - TRNVNEI LI YNNKYI INFLND 180 

Query: 405 TKK 407 
T K 

Sbjct: 181 TTK 183 



>gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon;
MAL3P5.6 (PFC0600w), Hypothetical protein, len: 250 aa
[Plasmodium falciparum]
Length = 249

Score = 47.3 bits (110), Expect = 2e-04

Identities = 53/215 (24%), Positives = 87/215 (39%), Gaps = 30/215 (13%)

Query: 209 NI YNLLQKHKMNTSRLYKNI FLEMRRNDYVNBKRNTRAFNSNDDAMTTGEFEF - -NEYNL 266 

NIYN L++ YKN N ++ +N N+N EFE N YN 

Sbjct: 13 NIYNKLEEK YKNFLKLKNMNSHMGASQNMNV - NNNYTMNELEEFEKI NNNYNN 64 

Query: 267 ADDNLRNH I NQNGD FFYI KTD DKYIKVMYNVTTFMTNI I WPYTKQYEFCTKIRD 321 

++N+ N+IN D+ IK +K ++ YN +1 T +++ 

Sbjct: 65 NNNNINNNINNYYDYMNIKVSQSVQHNICRIXP^ 124 

Query: 322 I DNHVTYLEDD M FYK ENMERYYYNPSNLHFDNAYSKNYVVDNDRYLYLDMNKI I K 376 

++ + Y RD+ K EN + N + N+ S NY DN+ LY +N++ K 

Sbjct: 125 LEKRLA YE RDNTLI KNI QEE ENKKG I G I NGN FG S E SNS SS S NY - - DNNYLL YRK INRLNK 182 

Query: 377 FHIKNEMKKNMSEFERKEKIYEDNYIENTKKYLMK 411 

+ ++ KI KKY++K 
Sbjct: 183 TNTNKSKNRSRKRKRINSKI DKKYIIK 209 



>gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]
Length = 1247

Score = 45.7 bits (106), Expect = 6e-04

Identities = 52/239 (21%), Positives = 94/239 (38%), Gaps = 38/239 (15%)

Query: 206 SNI^IYNLLQKHKMNTSRLYKNIFLEMRRNDYVNEKRNTRAFNSNDDAMTTGEFEFNEYN 265 

+N N +N ++K K R I +N + +N ++N+D EN N 

Sbjct: 474 NNTNKWNEI KKRKKKFKREKNKI INNSFQNQEAEDDKJJNNNNDNNNDNHNDNNNENNNEN 533 

Query: 266 IJu^DNLRNHINQNGDFFYI-KTDDKYIK VMYNVTTFMTNIIWPYTKQYEFCTKIR 320 

D+N N+ + N D I D+ Y +YN T ♦+ YTK ♦ + + 

Sbjct: 534 NNDNNNENNNDINNDINNIHNNDNNYYNND S92 

Query: 321 DIDNHVTYLRDDMFYKENME RYYYN PSNLHFDNAYS 356 

+ + ++ > FY++N + ++YYN + N 

Sbjct: 593 DMLPSI KFETFYEKNTTJHKNFNENYKFYYNTDDDTDI INAIKKKNVKNK^ 649 



Query: 357 KNYVVDNDRYLYLDMNKIIKFHIKNEMKKNMSEFER KEKIYEDNYIENTKKYLMK 411 

KNY+ N+ Y YL+ N+ + I * K +E K+ 1+ ++Y E K K 
Sbjct: 650 KNYI NHNE - YSYLE YNENKNYEI NKKEKLLTENYE YDMYI KDN I HYND YSEGDGKQTKK 707

tt££'l S&i^rSi;^ 11 - 96/245 (38%), Caps - 43/245 (17%, 
Query: 207 r™iYNLItfKHKMOTSR^ 266 

Sbjct: 564 SSwS^^ 623 

Query 267 ADD NLRNHINQNGDPF YIKTDDKYIKVMYNVT-TPMTNIIVVPYTKQ 312 

Query. ^ YI ++ Y + YN ♦ N T+ 

Sbjct: 624 DDDTD I INAI KKKNVKNK - KKNGNI VI KNY INKNE - YSYLE YNENKNYEINKKEKLLT EN 681 

Query: 313 YEFCTKIRDIDNHVTYI^DMFYKENMERYYYNP^ ^ 

YE+ I+D ++ Y D + + YN +N +N Y K +Y+ vu 

Sbjct: 682 YE YDMY I KDN I HYND YS EGDGKQTKKAS S F L YNNNN - - -NNKYKKEDNKTQIISYMDHVD 738 

Ouerv- 363 NDR-- YLYLDMNKI IKFH IK- NEM KKNMS E FERKEKI YEDNYI ENTKKY 408 

Query. 363 NDK y + + ++ F + K N+M K+ F +E I + +EN K + 

Sbjct: 739 NENGVKGUCICRNLFTfNNSDQLYNFDVKD 798 

Query: 409 LMKQY 413 

L K Y 
Sbjct: 799 LKKHY 803 

Query= pt|110877 44AHJDORF007 Phage 44AHJD ORF|2044-3027|1 1
(327 letters) 

>gi|1181960|emb|CAA87731.1| (Z47794) connector protein
[Bacteriophage CP-1]
Length = 337

Score = 45.7 bits (106), Expect = 5e-04
Identities = 44/184 (23%), Positives = 84/184 (44%), Gaps = 13/184 (7%)

Query: 127 qihklwhcmso^^ 184 

++HK + + +V+ N Y I +E + ++LA++ L+ L A+ + IF 
Sbjct: 125 E U*KDNPDKIKRPCIVIPNNNF-YE^ 182 

Query: 185 KSEINDESINQLVSEIYNGAPFVKMSPMFNAD DD I IDLTSNS VI PALTEMKR 236 

N S+- + ++I N P V ++ + D D I + L ++ 

Sbjct: 183 AD*nTJVl£KKNIFNXIANFEPVVYLNKQI 242 

Query- 237 EYQNKISEI^NYI/SINSIiAVDKESGVSDEEAKSNRGFTTSNSNI YLKGREP - ITFLSKRY 295 

E ++GIN+ DK+ + EA SN G ++N + K R + ++K Y 

Sbjct: 243 EKLRVMNQLLTFIGINNNPSDKKERLVVSEAISNNGVI SANI EVGWKSRRKFVELINKCY 302 

Query: 296 GLDI 299 
GL+I 

Sbjct: 303 GLEI 306 

>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103]
Length = 308

Score = 44.9 bits (104), Expect = 8e-04

Identities = 40/159 (25%), Positives = 73/159 (45%), Gaps = 11/159 (6%)

Query- 150 YNSDIEI lEHYTDELAEVA^SRFSLIMQAKFSKIFKSEINDESINQLVSEIYNG 203 

YN+D++ +E + +LAE+ + + Q I ++ N S+ + ++ 

Sbjct: 121 YNNDLKCSTLPALEMFAQDLAELKEI I AVNQNAQKTP VLI AANDNNQLSLKNI YNQYEGN 180 

Query: 204 APFVKMSPMFNADD - DI IDLTSNSVI PALTEMKREYQNKISELSNYLGINSLAVDKESGV 262

AP.i ♦ D+ + ♦ V+ L K N E+ YLGI + ++K+ + 
Sbjct: 181 APVIFVTJESIjDLDNLKVFKTDAPYvVDKLNAQKNAVWN EVMTYLGIKNANLEKKERM 237 

Query: 263 SDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300

E SN S+ NIYLK R E +S* YGL++K 

Sbjct: 238 VTSEVDSNDEQIESSGNIYLKARQEACNKISELYGLNLK 276 

>gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR
PROTEIN) (LATE PROTEIN GP10) >gi|75851|pir||WMBP10 gene
10 protein - phage PZA >gi|216059 (M11813) upper collar
protein [Bacteriophage PZA]
Length = 309 

Score = 43.8 bits (101), Expect = 0.002

Identities = 38/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%) 



Query: 150 YNSDIEI IEHYTDELAEVALSRFSLIMQAKFSKIF- -KSEINDESINQLVSEIYN 202

YN+D+ +E + ELAE+ S+ A+ + + ++ N S+ Q+ ++ 

Sbjct: 122 YNNDMSFPTTPTLELFAAELAELK- EI I S VNQN AQKTP VLI RANDNNQLSLKQ VYNQ YEG 180 

Query: 203 GAPFVKMS PMFNADD - DI IDLTSNSVI PALTEMKREYQNKI SELSNYLGINSLAVDKESG 261 

AP + ++D ++ + V+ L K N E+ +LGI + ++K+ 
Sbjct: 181 NAPV I FAKEALDSDS I EVFKTDAP YVVDKLNAQKNAVWN EMMTFLG I KNANLEKKER 237 

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300

+ +E SN S+ ++LK R E +++ YGLD+K 

Sbjct: 238 MvTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLDVK 277 



>gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR
PROTEIN) (LATE PROTEIN GP10) >gi|75852|pir||WMBPC9 gene
10 protein - phage phi-29 >gi|215328 (M14782) upper
collar protein [Bacteriophage phi-29] >gi|215340
(M12456) p10 connector protein [Bacteriophage phi-29]
>gi|224161|prf||1011232A protein p10, connector
[Bacteriophage phi-29] >gi|225365|prf||1301270E gene 10
[Bacteriophage phi-29]
Length = 309

Score = 41.4 bits (95), Expect = 0.009

Identities = 37/160 (23%), Positives = 75/160 (46%), Gaps = 13/160 (8%)

Query: 150 YNSDIEI I EHYTDELAEVALSRFSLIMQAKFSKIF- -KSEINDESINQLVSEIYN 202 

YN+D+ +E + ELAE+ S+ A+ + + ++ N S+ Q+ ++ 

Sbjct: 122 YNNDMAFPTTPTLELFAAELAELK- E I ISVNQNAQKTPVLIRANDNNQLSLKQVYNQYEG 180 

Query: 203 GAPFVKMSPMFNADD - DI IDLTSNSVI PALTEMKREYQNKI SELSNYLGINSLAVDKESG 261 

AP + ++D ++ + V+ L K N E+ +LGI + ++K+ 
Sbjct: 181 NAPVIFAHEALDSDSIEVFKTDAPYVVDKLNAQKNAVWN EMMTFLG I KNANLEKKER 237 

Query: 262 VSDEEAKSNRG FTTSNSNI YLKGR - EP ITFLSKR YGLD I K 300 

+ +B SN S+ ++LK R E YGL++K 

Sbjct: 238 MVTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLNVK 277 

Query= pt|110878 44AHJDORF008 Phage 44AHJD ORF|3020-3775|2 1
(251 letters) 

>gi|4982468|gb|AAD30963.2| (AF118151) SNF1/AMP-activated kinase
[Dictyostelium discoideum]
Length = 718

Score = 52.3 bits (123), Expect = 3e-06

Identities = 28/118 (23%), Positives = 56/118 (46%), Gaps = 5/118 (4%)

Query: 121 YLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYV SLPQS EVN I DVDN 176 

+ + GF N +♦ SN + +N H + N+ T N N + ++ ♦ +N + +N 
Sbjct: 382 FTTTTGFNPTNSNSISNNNNNNNNNNNNTTNNNNNTT^ 441 

Query: 177 TTLRFADNNT IDNGKTVNKS SNESNQNAKRNQNQKGNAKGTQFTKQYLID - NIDKAYD 233 

+NN I+N N ++N +N N N N N+ + T+ + I N++ +Y+ 
Sbjct: 442 NNNNINNNNIINNNNNNNNNNNNNNNNNN^ 499 



Score = 37.5 bits (85), Expect = 0.094

Identities = 17/111 (15%), Positives = 45/111 (40%) 

Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQS E VNIDVDNTTLRFADNNTI DN 189 

+N + +N + +N N + +N++ ++ + P +++++* N+ ++ 

Sbjct: 456 NNNNNNNNNNNNNNNNNNNNNNNSS I SGGTEVFS I S PNLNNS YNSNSSGNSNGSNSNNNS 515 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQ YLI DN I D KA YDLRKKI LN 240 

N +N +N N N N N ID+++ + + + N 

Sbjct: 516 NNNTNNDNNNNNNNNNNNNNNNNNNNNNNNNNNNCI 566 
Score = 32.8 bits (73), Expect = 2.4

Identities = 31/140 (22%), Positives = 57/140 (40%), Gaps = 14/140 (10%)

Query: 109 LNWYS SS EVEKYLQSQGFTEHNEDTTS NTDETSNQNATS LDNSTGMTANRNAYVS L 165 

LN Y+S+ S N +T + N + +N N + +N+ N N + 

Sbjct: 494 l^NNSYNSNSSGNSNGSNSNNNSNNNTNNDNNNNNNN^ 553 

Query: 166 PQSEVN- - IDVDNTTLRFADNNTIDNGKTVNKSS NESNQNAKRNQNQKGNAK 215 

+ +N DV+N+ + +NN D+G N ++ N N + N GN 

Sbjct: 554 VNN S LNNEND VNNS N I NNNNNNNSDDG SNNNS YEGGGD VLLLS D LNGNNQ LGGNDNGNVV 613 

Query: 216 GTQ FT KQ YLIDNI DKA YD LR 235 

Q L++++D D++ 
Sbjct: 614 NLNNNFQ- LLNSLDLNSDIQ 632 

Score = 31.7 bits (70), Expect = 5.4

Identities = 25/115 (21%), Positives = 48/115 (41%), Gaps = 10/115 (8%)

Query: 130 HNEDTTS NTDETSNQNATS LDNST - - - GMTAN - RNA YVS L PQS EVN I D VDNTTLRFADNN 185 

+N + +N + +N N +S+ T ++ N N+Y S S N + N+ +N 
Sbjct: 462 NNNNNNNNNNNNNN1JNNSS ISGGTEVFSI SPNLNNSYNS - - NSSGNSNGSNSNNNSNNNT 519 

Query: 186 TIDNG KTVNKS SNE SNQNAKRNQNQKGNAKGTQ FTKQ YL I DNI D KA YDLRKK I LN 240 

ON N ++N +N N N N N + ++++ D+ +N 

Sbjct: 520 NNDN NNNNNNNNNNNNNNNNNNNNNNNNNN 570 

Score = 31.7 bits (70), Expect = 5.4 

Identities = 15/104 (14%), Positives = 43/104 (40%)

Query: 110 NVVYSSSEVISKYLQSQGFTEHNEDTTSNTDETSNQNAT^ 169 

N* +++ ♦ + +N + +N 4 +N N + +N+ + + V 

Sbjct: 434 NINNNNNNNNNNINNNNIINNNNNNtlNNNNNNN^ 493 

Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 

♦N ++ + ++ + +N N ++♦ +N N N N N 
Sbjct: 494 LNNSYNSNSSGNSNGSNSNNNSNNNTNNDNNNNNNNNNNNNNNN 537 

Score = 30.9 bits (68), Expect = 9.2

Identities = 16/84 (19%), Positives = 34/84 (40%) 

Query: 130 HNEDTTSNTDETSNQNATSIJJNSTGMTANRNAYVSLPQSEVNIDvTJNTTLRFAD 189 

+N + +N + +N N + +N+ + S+ + N N++ +N+ +N 

Sbjct: 455 NNNNNNNNNNltfNNNNNNNNNN^ 514 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGN 213 

+ N +N N N N N 
Sbjct: 515 SNNNTNNDNNNNNNNNNNNNNNNN 538 

>gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SPORE
LYSIS A (TYROSINE-PROTEIN KINASE 1) >gi|974334 (U32174)
non-receptor tyrosine kinase [Dictyostelium discoideum]
Length = 1584

Score = 46.5 bits (108), Expect = 2e-04

Identities = 29/106 (27%), Positives = 48/106 (44%), Gaps = 4/106 (3%)

Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID VDNTTLRFADN -N 185 

+NED +SN + +N N + +N+ N N + + N + ++NTT N N 

Sbjct: 442 NNEDISSNNNNNNNNNNNNNNNNNNNNNNNNNNNNN^ 501 

Query: 186 TI DNG KTVNKSSNES NQNAKRNQNQ KGNAKGTQ FTKQ YL I DNI DKA 231 

+N N +SN +N N N N N TK+ I + D++ 

Sbjct: 502 NNNNNNNSNSNSNSNNNNINNNNNNNNNNNNI YLTKKPS IGSTDES 547 

Score = 34.0 bits (76), Expect = 1.1

Identities = 20/117 (17%), Positives = 46/117 (39%)

Query: 87 NRQTVEAFGMQVITVCITHEDYUA^SSSEVEKYLQSQGFTEHNEDTTSN^ 146 
N G IT T + + + +N ♦ +N + +N N 
Sbjct: 415 NNNNNNIIGNGKITTTTTTSTSPSSINNNEDISSNbTO 474

Query: 147 TSLDNSTGMTANRNAWSLPQSEVNIDVDNTTLRFADNNTIDNGKTVNKSSNESNQN 203 

+ ++♦+ TNN + ♦ N + +N N+ +N N ++N +N N 

Sbjct: 475 NNNNSNSSNTNNNN I NNTTNNNNSNSNNNNN^ I NNNNNNNNNNN 531 

Score = 33.2 bits (74) , Expect = 1.8 

Identities = 18/88 (20%), Positives = 35/88 (39%)

Query: 130 HNEDTTSNTDETSNQNATSIJ}NSTGMTANFNAYV^ 189 

♦N + ++N + +N N T T + S+ +E +N +NN +N 

Sbjct: 405 NNNNNSNNNNNNNNNNIIGNGKITTTTTTSTSPSS 464 

Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

N ++N +N N+ + NT 
Sbjct: 465 NNNNNNNNNNNNNNSNSSNTNNNNINNT 492 

Score = 32.5 bits (72), Expect = 3.1

Identities = 18/94 (19%), Positives = 37/94 (39%)

Query: 120 KYLQSQGFTEHNEDTTSNTDBTSNQNATS LDNSTGMTAN RNAYVS L PQ S EVN I DVDNTTL 179 

K + S N + +N++ +N N ++ + +T S N D+ + 

Sbjct: 392 KNVNSTS I LVPNGNNNNNSNNNNNNNNNNI IGNGKI TTTni 'STS PSSI NNNEDI SSNNN 451 

Query: 180 RFADNNT I DNGKTVKKS SNESMQN AKRNQNQKGN 213 
+NN +N N ++N +N N + + N 

sbjct: 452 imnjmrnrnmrtnjmmmmmmijsiissimj 485 
Score = 32.5 bits (72), Expect = 3.1

Identities = 24/110 (21%), Positives = 44/110 (39%), Gaps = 10/110 (9%) 

Query: 138 TD ETSNQNATS LDNS TGMTANRK A YVS L PQS EVN I D VDNTTLR FADNNT I DNGK 191 

T T++ + +S++N+ +++N N + + N + +N +NN N 

Sbjct: 429 TTTTTSTSPSSINNNEDISSNNNNKNNNNNNN^ 488 

Query: 192 TVNKSSNE SNQN AKRNQNQ KGNAKGTQ FTKQYLI DN I D KA YDLRKK 237 

T N +SK +N N N N N+ +N + L KK 

Sbjct: 489 INNTTNNNNSNSNNNNNNNNSNSNSN^ 538 

>gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon;
MAL3P6.11 (PFC0760c), Hypothetical protein, len: 3395 aa
[Plasmodium falciparum]
Length = 3394

Score = 46.5 bits (108), Expect = 2e-04

Identities = 52/202 (25%), Positives = 96/202 (46%), Gaps = 32/202 (15%)



Query: 21

F ++ ++ K T D+ M+K K D DV + NEK++ L ++L+ + + KK

Sbjct: 665

Query: 78 TIHFLDREINRQTVEAFGMQV ITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNE 132

♦H + IN* + + +QV IV + DY t S + + K ♦ +N

Sbjct: 722

Query: 133

+SN D +NQN +++N+ + N+N N +++N + N +N

Sbjct: 778 KSSNKD Y - NNQNNQN I ENNQN I ENNQN - - NQNIEN NQNIENNQNN 820

Query: 193

N +N++NQN + NQN + NA

Sbjct: 821



Score = 33.6 bits (75), Expect = 1.4

Identities = 46/221 (20%), Positives = 89/221 (39%), Gaps = 37/221 (16%)

Query: 10 DFIKSELIKXGFNEFVNDNKLTFYDDEFQFMQK^ 69 

D+KEK N ++LY++ M+K K + V K SL 

Sbjct: 367 DSUCIEYNKSKTNIQQLNEQLVNYKNFIKEMEKKYK QLWKNNSLFSITH 416 

Query: 70 DLLFKKS FT I H FLDRE I NRQTVEAFGMQVI TVC I TH EDYLNVVYSSSEVEKYLQSQG 126 
D+K+I+R+ + + ♦+ ♦ I H +D+L+V+Y + + L +
Sbjct: 417 DFINLKNSNI 1 1 IRRTSDMKQI FKMYNLD I EHFNEQDHLS VI Y IYEILYNTN 468 

Query: 127 FTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEWIDVDNTTLR 186 

+N D +N D +N N + +N+ N N N + +N + 
Sbjct: 469 - DNNNNDNDNNNDNNNNNNNNNDNNNNNNNDN^ --NNNYNNIMM M 512 

Query: 187 I DNGKTVNKS SNESNQNAKRNQNQKGNAKGTQFTKQYLI DN 227 

I+N + N ♦++ N + N N ♦ N + + +Y I+N 
Sbjct: 513 IENMNSGNHPNSNNLHNYRHNTOTENNLSSLKTSFRYKINN 553 



Score = 32.8 bits (73), Expect = 2.4

Identities = 28/122 (22%), Positives = 53/122 (42%), Gaps = 2/122 (1%)

Query: 119 EKYLQSQGFTEKNEDTTSNTDETSNQNATSLDNSTGOTANRNAYVSLPQSEWID-VDOT 177 

E Y S + +++ N + +N + + DN+ N N +N D ++N 

Sbjct: 2838 ENYP VSTHYDNNDD I NKDN INNDNNNDN I NDDNNNDN I NNDNNNDNI NNDN I NNDN I NND 2897 

Query: 178 TI^FACNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAXGTQFTKQYLIDNIDKAYDLRKK 237 

+N+ +NG SSN N NNKN+G + + + + YD K 

Sbjct: 2898 NNNDNNNDNSNNGFVCELSSNINDFNNILNVN- KDNFQG INKSNNFSTNLS EYNYDAYVK 2956 

Query: 238 IL 239 
1 + 

Sbjct: 2957 IV 2958 



Score = 32.5 bits (72), Expect = 3.1

Identities = 46/249 (18%), Positives = 101/249 (40%), Gaps = 31/249 (12%)

Query: 9 YDFIKSELIKKGFNEFWDNKLTFYDDEFQFMQKMLKTO 68 

Y-M.+K ++ N N NK E Q++ K+ + + + +E K L++ 

Sbjct: 21S0 YNYVK VQNATNREDNKNK ERNLSQEIYKYINENIDLTSELEKKNDMLENYK 2200 

Query: 69 SDL LFKKSFTIHFI^REINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYL 122 

++L ++K + I L + M+++ N + E+ + L 

Sbjct: 2201 NEUCEKNEE I YKLNND IDMLS NNC KKLKES I MMMEK YKI I MN NNIQEKDEIIENL 2255 

Query: 123 QSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTAN RNAYVSLPQSB VNIDV 174 

+++ + +D +N + ++S M+ + N + +L +S N+D+ 

Sbjct: 2256 KNK-YNNKI^DLINNYSVVDKSIVSCFEZ>SNIMSP 2314 

Query: 175 DNTTLRF ADNNT I DNGKTVNKS SNE SNQNAKRNQNQKGNAKGTQ FTKQ YL I DN I D KAYDL 234 

N + ++I+N +N +N +N N N N N K YL++N+ D 

Sbjct: 2315 CNENMDSI--SSINNVNNINNVNNINNVNNIN^ 2372 

Query: 235 RKKILNEFD 243 
1+ +F+ 

Sbjct: 2373 DNIIIIKFN 2381 



Score = 32.1 bits (71), Expect = 4.1 

Identities = 20/103 (19%), Positives = 48/103 (46%), Gaps = 2/103 (1%)

Query: 115 SSEVEKYLQSQGFTEHNEDTTSNTDETSNQN- -ATSLDNSTGMTANRNAYVSLPQSEVNI 172 

++♦ EKY EH + N D +N+N L ++ ++ + N S ++E+ 

Sbjct: 3264 NNDEEJCYSCHDDKNEHTNNDLLNIDHDN^ 3323 

Query: 173 DVDNTTLRFADNNTI DNGKTVNKS SNESNQNAKRNQNQKGNAK 215 

+ + D N ++ N ++E+++N + ++N -f + K 
Sbjct: 3324 LISIDSSNENDENDENDENDENDENDENDENDENDENDENDEK 3366 



Score = 30.9 bits (68), Expect = 9.2

Identities = 27/118 (22%), Positives = 53/118 (44%), Gaps = 15/118 (12%)

Query: 104 THEDYLNWYSSSEV EKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANR 159 

T+ D LN+ + E Y HN+D ++ +E QN S+D+S N 

Sbjct: 3280 TNNDLI2JIDHDNNKNNITDELYSTYNVSVSHNKDPSNKENEI - -QNLISIDSSNENDEND 3337 * 

Query: 160 NAYVSLPQSEVN I DVDNTTLRFADNNT I DNGKTVNKS SNESNQNAKRNQNQKGNAKGT 217 

N •*■ D D N ++ N +E+++N ♦ ++N N +GT 

Sbjct: 3338 EN DENDENDENDEN DENDENDENDENDEKDENDENDENDENFDNNNEGT 3386 
>gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP)
>gi|626139|pir||S45907 DNA-binding protein REB1 - yeast
(Saccharomyces cerevisiae) >gi|536280|emb|CAA84992|
(Z35918) ORF YBR049C [Saccharomyces cerevisiae]
>gi|559944|emb|CAA86391| (Z46260) REB1 DNA-binding
protein [Saccharomyces cerevisiae]
Length = 610

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINROTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDT^ 142 

D+ N+++VE ++ + V + H+++ +♦+ K+ ♦ Q E + D N ++ S 

Sbjct: 7 DIOJANQESv^EA VXKYVGVGLDHQNRl)PQliHTKDLENKHS KKQN I VES S SD VDVNNNDDS 66 

Query: 143 NQNATSLDNSTGMTANRNAYVSLPQS EVN I DVDNTTLRFADNNTID NGKTVNKS SNE 199 

N+N + D+S ++A L +E + +VD+ N +D N+ +B 

Sbjct: 67 NRNEDNNDDSENISA I2JANESSSNVDHANSNEQHNAV>1DWYLRQTAHNQQDDE 119 

Query: 200 SNQNAKRNG^QKGNAKGTQFTKQYLIDNIDKAYDLRKK 237 

++N N GN F++ ++ +D D KK 

Sbjct: 120 DDEN- -NNNTDNGNDSNNHFSQSDIV- - VDDDDDKNKK 153 

>gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]
Length = 809

Score = 45.7 bits (106), Expect = 3e-04

Identities = 34/158 (21%), Positives = 72/158 (45%), Gaps = 14/158 (8%)

Query: 83 DREINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYL^ 142 

D+ N+++VE ++ + V + H+++ +++ K+ + Q E + D N ++ S 

Sbjct: 7 DKNANQESTOEAVXKYVOTGLDHQNHDPQIOTKDLEN 66 

Query: 143 NQNATSLDNSTGMTANRNAYVSLPQSEVN I DVDNTTLRFADNNTID NGKTVNKS SNE 199 

N+N + D+S ++A L +E + +VD+ N +D N+ +E 

Sbjct: 67 NRNEDNNDDSENISA - LNANES S SNVDHANSNEQHNAVMDWYLRQTAHNQQDDE 119 

Query: 200 SNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKK 237 

++N N GN F++ +♦ +D D KK 

Sbjct: 120 DDEN- -NNNTDNGNDSNNHFSQSDIV- -VDDDDDKNKK 153 

>gi|2952545 (AF051898) coronin binding protein [Dictyostelium
discoideum]
Length = 560

Score = 44.9 bits (104), Expect = 6e-04

Identities = 26/83 (31%), Positives = 39/83 (46%), Gaps = 5/83 (6%)

Query: 131 NEDTTSNTD ETSNQNATS LDNSTGMTANRNAYVS L PQS EVNI D VDNTTLRFADNNT IDNG 190 

N + +N +N N+ S +NS +N N+ + P N D DN T +NNT +N 
Sbjct: 404 NNNNNNNIINNNNSNSNSNNNSNN-NSNNNSNRNSPNHNNNGDNDNNT NNNTNNNN 458 

Query: 191 KTVNKSSNESNQNAKRNQNQKGN 213 

N ++N +N N N N N 
Sbjct: 459 NNNNNNNNNNNNNNNNNNNNNNN 481 



Score = 41.4 bits (95), Expect = 0.006 

Identities = 22/88 (25%), Positives = 43/88 (48%), Gaps = 6/88 (6%)

Query: 130 HNEDTTSNTDETSNQNATSLDN STGMTANRNAYVS LPQS EVNIDVDNTTLRFADNNT 186 

+ ++ +N++ SN N+ + +N + G AN++ + P + +N ♦ DN +NN 
Sbjct: 337 NRNNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NSPNNNLNTNNDNKNNNSNNNNN 393 

Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNA 214 

+N S+N +N N N N N+ 

Sbjct: 394 SNNNSNNGNSNNNNNNNI INNNNSNSNS 421 



Score = 40.6 bits (93), Expect = 0.011

Identities = 24/101 (23%), Positives = 41/101 (39%), Gaps = 2/101 (1%)

Query: 115 SSEVEKYLQSQ/3FTEHNEDTTSNTDETSNQNATSLDNSTGKTANRNAWSLPQ 174 
S+ L + ++N *N ♦+ N S +N+ N N S + N + 
Sbjct: 370 SNSPNNNLNTONDNKNNNSNNNNNSNNNSNNGN^ 429

Query: 175 DNTTLRFADN- -NTIDNGKTVTIKSSNESNQNAKRNQNQKGN 213 

4N + R ♦ N N DN N ++N *N N N N N 
Sbjct: 430 NNNSNRNS PNHNmGDNDNNTNNNTNNNNNNNNNNNNNNNN 470 

Score = 40.2 bits (92), Expect = 0.014

Identities = 21/80 (26%), Positives = 39/80 (48%), Gaps = 9/80 (11%)

Query: 130 HNEDTTSNTDETSNQNATSIJDNSTGOTANR^ 189 

+N D +NT+ +N N + +N+ N N H ♦ +N +ADN+ 

Sbjct: 442 NNGDNDNNTNNNTKNNNNNNNNNNNNNNNNNN NNNNNNNNNNYADNSNNNS 4 92 

Query: 190 GKTVNKSSNESNQNAKRNQN 209 

+ N +SM +N N +N+N 
Sbjct: 493 SNSNNNNSNSNNNNDNKNEN 512 

Score = 39.5 bits (90), Expect = 0.024 

Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 20/111 (18%)

Query: 112 VYSSSEVEKYLQSQ- -GFTEHNEDTTSNTDETSNQNATSIJDNSTGMTANRNAYVSLPQSE 169 

VY + K+ ♦ + G +N ++ +N++ SN N ++N N N 
Sbjct: 296 VYCTHHHTKFYETHRNGLLNNNNNSN^ - 346 

Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSS NESNQNAKRNQNQKGNA 214 

+ N ++N I NG NKS+ N +N N N N N+ 

Sbjct: 347 NNSNNNSNNSNNRNI TNGSNANKSNS PNNNIJWTNNDNKNNNSNNNNNS 394 

Score = 37.5 bits (85), Expect = 0.094

Identities = 24/96 (25%), Positives = 41/96 (42%), Gaps = 1/96 (1%)

Query: 124 SQGFTEHNEDTTSNTDETSNQNATSLDNSTGM - TANRNAYVSLPQSEVNIDVDNTTLRFA 182 

S + +N + SN + ++ N DN+T TNN + +N + +N 
Sbjct: 421 SN1MSNNNSNNNSNRNSPNHNNNGDNDNNTNNOT 480 

Query: 183 DNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQ 218 

+NN DN + +SN +N N+ N + K Q 
Sbjct: 481 NNNYADNSNNNS SNSNNNNSNSNNNNDNKNEN SDNQ 516 

Score =35.6 bits (80), Expect = 0.36 

Identities = 25/99 (25%), Positives = 42/99 (42%), Gaps = 18/99 (18%) 

Query: 130 HN EDTTSNTD ET SNQNATS LDN ST - GMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTI D 188 

+N + SN + +N N ++ N T G AN++ + P ♦ +N + DN +NN + 

Sbjct: 339 NNSNNNSNNNSNNNSNNSNNRNITNGSNANKS- - - NS PNNNLNTNNDNKNNNSNNNNNSN 395 

Query: 189 NGKTV NKSSNESNQNAKRNQNQKGN 213 

N N S++ SN N+ N N N 

Sbjct: 396 NNSNNGNSNNNNNNNI INNNNSNSNSNNNSNNNSNNNSN 434 

Score = 35.2 bits (79), Expect = 0.47

Identities = 21/94 (22%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 124 SQGFTEHNEDTTSNTOETSNQNATSIJDNSTGMTA^ 183 

+ G + +♦ +N T+N N + N+ N N+ + N + +N ♦ ♦ 

Sbjct: 362 TNGSNANKSNS PNNNLNTNNDNKNNNSNN NNNSNNNSNNGNSNNNNNNNI INNNN 416 

Query: 184 NNT I DNGKTVNKS SNESNQNAKRNQNQ KGN AKGT 217 

+N+ N ♦ N S+N SN+N+ ♦ N N T 
Sbjct: 417 S NS N SNNNSNNNSNNNSNRNS PNHNNNGDNDNNT 450 

Score = 35.2 bits (79), Expect = 0.47

Identities = 29/118 (24%), Positives = 53/118 (44%), Gaps = 12/118 (10%)

Query: 115 SSEVEKYLQS-QGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID 173 

SS+ E <►+ +GF + + T+N ++N D S+G + + V+ P+S +N 

Sbjct: 114 S SDS EAD I EDDKG FQD - - K P I TTNNSGSNN PLKNLKD YS SG S SG S SRSGVNQ PRSN I NNS 171 

Query: 174 VDNTTLRFADNNT I DNGKTVNKS SNESNQNAKRNQNQKGNAKGTQFTKQ 222 

D ♦ + +N+ I + T + NQN +NQNQ N Q +Q 
Sbjct: 172 NDKYKSKSSSSNSNSSSSGGSLISSLLTGGNTYQNQNQNQNQNQNQNNNQSQLQQQQQ 229

Score = 34.4 bits (77), Expect = 0.81

Identities = 24/94 (25%), Positives = 38/94 (39%), Gaps = 12/94 (12%)

Query: 131 NEDTTSNTDBTSNQNATSUJNSTGMTANIWAYV^^ 190 

N +T +N + N + +N+ N N S N N +NN+ N 

Sbjct: 451 NNNTNNNNNNNNNNNNNNNNKNNNN^ NNNSNSNN 504 

Query: 191 KTVKKS SNE SNQNAKR NQNQKGNAKGTQ 218 

NK+ N NQ+ R ++NQK 4- Q 

Sbjct: 505 NNDNKNENSDNQS VLRSNEKFTD ENQ KNG SDDQQ 538 

Score = 33.6 bits (75), Expect = 1.4

Identities = 22/90 (24%), Positives = 35/90 (38%)

^ ^ S N SN N N+ N N + + N + +N 

Sbjct: 353 SNNSNNRNITNGSNANKSNSPNNNLNTNND^ I 412 

Query: 184 NNT I DNGKTVNKS SNE SNQNAKRNQNQKGN 213 

NN N + N S+N SN N+ RN N 
Sbjct: 413 NNNNSNSNSNNNSNNNSNNNSNRNS PNHNN 442 

>gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium
reichenowi]
Length = 655

Score = 44.5 bits (103), Expect = 7e-04

Identities = 31/114 (27%), Positives = 47/114 (41%), Gaps = 14/114 (12%)

Query: 128 TEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQ S EVN IDVDNTTLRF 181 

T++N T TD + + +N+T AN + ++ N D +NT + 

Sbjct: 433 TDNNNTNTKATDSNNTNTKATDNNNTNTKATDNNNTOT 492 

Query: 182 ADNNTI DNGKTVNKSSNESNQNAKRNQNQKGNAKGT QFTKQYLIDN 227 

DNN DN T K+++ +NNK N NKT T QY+ N 

Sbjct: 493 TDNNNTNTKATDNNNTNTKATDNNNTNT1CATO 546 

Score = 44.5 bits (103), Expect = 7e-04

Identities = 30/103 (29%), Positives = 44/103 (42%), Gaps = 13/103 (12%)

Query: 128 TEHNEDTTSNTD ETSNQNATS LDNS TGMTANRNAYVSLPQSEVN IDVDNTTL 179 

T++N T TD+++N + + DN+ T T N N S D +NT 

Sbjct: 401 TDNNNTDTKATDKSNNTDTKATDNNNNTDTKATO 460 

Query: 180 RFADNNTI DNGKTVNKS SNESNQNAKRNQNQKGNAKGT 217 

+ DNN DN T K+++ +N N K N NKT 

Sbjct: 461 KATDNNmmTCATDNNNTNTKATDNNNTNT^ 503 

Score = 42.6 bits (98), Expect = 0.003 

Identities = 27/96 (28%), Positives = 43/96 (44%), Gaps = 10/96 (10%)

Query: 128 TEHNEDTTSNTD ETSNQNATS LD - NSTGMTANRNAYVS LPQSEVNIDVDNTTLRFADNNT 186 

T++N +T + + +N N + D N+T AN + ++ N NT + DNN 
Sbjct: 422 TDNNNNTDTKATDNNNTNTKATDSNNTNTKATO NTNTKATDNNN 4 77 

Query: 187 I DNGKTVNKS SNESNQNAKRNQNQKGNAKGT 217 

DN T K+++ +N N K N NKT 
Sbjct: 478 TNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKAT 513 

Score = 41.8 bits (96), Expect = 0.005 

Identities = 35/150 (23%), Positives = 59/150 (39%), Gaps = 9/150 (6%)

Query: 85 EINRQTVEAFGMQVIWCITHEDYLNVVYSSSEVEKYI^ 144 

E N+ ++ G T+ + N + E ♦ +Q T +N TT+ + N 

Sbjct: 118 ETNKTNIKLTGNNSTTINTNLTENTNA- -TiaCLTENVITNQILTGNNNTTTNTSSTEHNN 175 

Query: 145 NATSLDNSTGMTANRNAYVSLPQS EVN I DVDNTTLRFADNNT I DNGKTVNKS SNESNQNA 204 
N ♦ NSTG T+ NI + N L +N T ♦ T + +♦ +N N+ 
Sbjct: 176 NINTNTNSTGNTSTTKKLTE NI - ITNQILTGNNNTTTNTSSTEHNNNINTNTNS 228

Query: 205 KRNQNQKGN AKGTQ FTKQY L I DN I D KAYDL 234 

N N N T + DNI+ +L 

Sbjct: 229 TDNSNTNTNLTD ITTTTKKWTDN I NTTQNL 258 

Score = 41.8 bits (96), Expect = 0.005

Identities = 30/101 (29%), Positives = 43/101 (41%), Gaps = 13/101 (12%)

Query: 130 HNEDTTSNTDETSNQNATSLDNS -TGWTANRNAYVSLPQSEVNIDV DNTTLRFA 182 

+N DT S ++ +♦ AT DN+ T T N N + N D +NT + 

Sbjct: 363 NNTDTI STDNDNTDTKATDNDNTDTKATDNNNNTDTKATO 422 

Query: 183 DNN T I DNGKTVNKSSNESNQN AKRNQNQKGNAKGT 217 

DNN DN T K+++ +N N K N N K T 

Sbjct: 423 DNNNNTDTKATDNNNTNTKATDSNNTNTKATO 463 

Score = 40.6 bits (93), Expect = 0.011

Identities = 31/121 (25%), Positives = 47/121 (38%), Gaps = 31/121 (25%)

Query: 128 TEHNEDTTSNTDETSNQNAT SLDNSTGMTANRNAYVSLPQSEVN- 171 

TEHN + +NT+ TN + T ++ + +T N N + +EN 

Sbjct: 171 TEHNNNItrTNTNSTGNTSTTKKLTENIITNQILTGN^ 230 

Query: 172 1 DVDNTT LRFADN NT IDNGKTVNKS SNE SNQNAKRNQNQKGNAKG 216 

D+ TT ++ DN T N TV+ +H +N N K N N K 

Sbjct: 231 NSNTNTNLTDITTTTKKWTDNINTTQNLTTSTN^ 290 

Query: 217 T 217 
T 

Sbjct: 291 T 291 
Score = 38.3 bits (87), Expect = 0.055 

Identities = 28/98 (28%), Positives = 41/98 (41%), Gaps = 10/98 (10%)

Query: 128 TEHNEDTTSNTDETSNQNATS LDNSTGMTANRNAYVSLPQSEVN ID VD - NTTLRFADNNT 186 

TEHN + +NT+ S N+ + N T +T + + N+ NTT DNN 

Sbjct: 216 TEHNNNINTNTN--STONSNTNTNLTDITTTTIOCWTD 273 



Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DN T KS++ N K N+ + K T 
Sbjct: 274 NNINTKPTDNNNTNIKSTDNYNTGTKETDNKNTDIKAT 311 

Score = 37.5 bits (85), Expect = 0.094

Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 18/106 (16%) 

Query: 128 TEHNEDTTSNTDETSNQN AT SLDNSTGMTANRNAYVSLPQSEVN IDVDN 176 

T++N +T +T T N N AT N+T AN + ++ N D +N 

Sbjct: 390 TDNNNNT- - DTKATDNNNTDTKATDKSNNTDTKATDNNNNTDTKATDN^ 447 

Query: 177 TTLRFADNN TIDNG KTVNKS SNESNQNAKRNQNQKGNAKGT 217 

T + DNN DN T K+«-+ +N N K N N K T 

Sbjct: 44 8 TNTKATDNNNTNTKATDNNNTNTKATDN 493 

Score = 35.2 bits (79), Expect = 0.47

Identities = 24/109 (22%), Positives = 46/109 (42%), Gaps = 6/109 (5%)

Query: 128 T EHNEDTT SNTDETSNQNATS LDNS TGMTANRN A YVS L PQS EVN 1 DVDNTT LRF 181 

T++N T TD + + +N+T AN + ++ N D +NT + 

Sbjct: 473 TDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTNT KATDNNNTNTKATDNNNTNTKA 532 

Query: 182 ADNNTIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQ YLIDN I DK 230 

DNN N ♦ +E+ + K N++ N++ + K + +DK 

Sbjct: 533 TDNNNNTNQYVFANNYDETTSDDKLNKDSCDNSEEKENI KSMINAYLDK 581 



Score = 34.4 bits (77), Expect = 0.81

Identities = 26/126 (20%), Positives = 46/126 (35%), Gaps = 7/126 (5%)
Query: 99 IT^CITHEDYU^Ty'SSSEVEKYr^SQGFTEHNEDTTStm>ETSNQNATSIZ}NSTGOTAN 158

IT T+ ♦ S + V S T +N T N N ++ T 

Sbjct: 318 ITTDNTNTNV I STDNS KTNVI SKDNSNTHT I STDNS KTNV I STDNNNTDTI STDNDNTDT 377 

Query: 159 RNAYVS LPQS BVN I D VDNTTLRF ADNNT I D NGKTVNKSSNESNQNAKRNQNQK 211 

♦ ++ + +NT ♦ DNN D N +N+N + KN 

Sbjct: 378 KATDNDNTDTKATDNNNNTDTKATDNNNTDTKATD 437 



Query: 212 GNAKGT 217

N K T

Sbjct: 438 TNTKAT 443



Score = 34.4 bits (77), Expect = 0.81

Identities = 30/100 (30%), Positives = 44/100 (44%), Gaps = 14/100 (14%)

Query: 131 NEDTTSNTDETSNQNATSLDNS - TGMTANRNAY VSLPQSEVNI DVDNTTLRFAD 183

N + T TDTNN S DNS T + + N+ +S S+ N+ D +NT D

Sbjct: 313 NNNITITTDNT - NTNVI STDNS KTNVI SKDNSNTHTI STDNSKTNVI STDNNNTDT I STD 371

Query: 184 NNTIDNGKTVNKSS NESNQNAKRNQNQKGNAKGT 217

N+D T N N +N + K N +KT

Sbjct: 372 NDNTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTKAT 411



Score = 34.4 bits (77), Expect = 0.81

Identities = 28/101 (27%), Positives = 41/101 (39%), Gaps = 15/101 (14%)

Query: 131 NEDTTSNTDETSNQNATSLDNSTGMTA- -NRNAYVSLPQSEVNIDV DNTTLRFA 182 

N DT + ++ ++ AT +N+T ANN N D +NT + 

Sbjct: 374 NTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTKA 433 

Query: 183 DNNTIDNGK TVNKSSNESNQNAKRNQNQKGNAKGT 217

DNN N K T K+++ +N N K N N K T 

Sbjct: 434 DNNN-TNTKATDSNNTNTKATDNNNTNTKATDNNNTNTKAT 473 

Score = 32.5 bits (72), Expect = 3.1

Identities = 30/110 (27%), Positives = 40/110 (36%), Gaps = 23/110 (20%)

Query: 131 NEDTTSNTDETSNQNATSLDNS TGMTANRNAYVSLPQS EVN I DVDNTTLRF 181

N +TT N ++N S DN+ TTNN+ + DNT++ 

Sbjct: 251 NINTTQNLTTSTNTTTVSTDNNNNNINTKPTDNNNTNI KSTDNYNTGTKETDNKNTD I KA 310 



Query: 182 ADNNTI DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 

DNN I DNKTS + SN+ NKNT 

Sbjct: 311 TDNNNITITTDNTNTNV1 STDNSKTNVI SKDNSNTHTISTDNSKTNVI ST 360 



>gi|1429240|emb|CAA67659| (X99260) lower collar protein
[Bacteriophage B103]
Length = 293

Score = 43.8 bits (101), Expect = 0.001

Identities = 53/204 (25%), Positives = 79/204 (37%), Gaps = 42/204 (20%)

Query: 56 EKVFKG FSLKDELSDLLFKKS FTI HFLD RE INRQTVEAFGMQVI TVC ITHED 107 

EK+ KG F + ♦ D ++K F HF+ REI +T F + T I + 

Sbjct: 26 EKI EKGRPKLFDFQY P I FDE S YRKVFETHFI RNFYMRE IG FETEGLFKFNLETWL I INMP 85 

Query: 108 YLNWYS S SEVEKY LQSQGFTEH NEDTT SNTDETSNQNA 146 

Y N +♦ S E+ KY L ♦ G ++ N DTT SNT ♦ NA 

Sbjct: 86 YFNKLFES - ELI KYDPLENTRLNTTGNKKNDTERNDNRDTTGSMKADGKSNTK^ 144 

Query: 147 TSLDNSTGMTA NRNAYVSLPQSEVNIDVDN- -TTLRFADNNTIDNGKTVNKS 196 

T G T NR PS +N+ ++ TL +A ♦ 1+ T NK 

Sbjct: 145 TGSSKEDGKTTGSVTDDNFTTOKIDSDQPDSRLNLTTNDGQGTLEYA- -SAIEENNTNNKR 202 

Query: 197 SNESNQNAKRNQNQKGNAKGTQFT 220 

+ N + + GT T 

Sbjct: 203 NTTGTNNVTSSAESESTGSGTSDT 226 



Query= pt|110879 44AHJDORF009 Phage 44AHJD ORF|5744-6496|2 1
(250 letters) 
>gi|2764981|emb|CAA69021.1| (Y07739) N-acetylmuramoyl-L-alanine
amidase [Staphylococcus phage Twort]
Length = 467

Score = 180 bits (452), Expect = Xe-44

Identities = 89/157 (56%), Positives = 109/157 (68%), Gaps = 8/157 (5%)

Query: 1 MKSQQQAKEWIYKHEGAGVDFIX3AYGFQCMD 60 

MK+ +QA+ +1 G DFDG YG+QCMDL+V Y+Y+VTOGK+RMWGNAKDAINN F 

Sbjct: 1 MKTLKQAES Y I KSKVNTGTDFDGLYGYQCMDLAVDY I YHVTDG KI RMWGNAKDA INNS FG 60 

Query: 61 G LATVYKNT P S FKPQLGDVAVYTNGQ - - - YGHIQCVLS GNLDYYTCLEQNWLGGGF 113 

G ATVYKN P+F+P+ GDV V+T G YGHI V ♦ G+L Y T LEQNW G G 
Sbjct: 61 GTATVYKNYPAFIU'KYGDVVVWTTGNFATYGH IAIVTNPD P YGDLQYVTVLEQNWNGNG I 120 

Query: 114 IX5WEKATIRTHYYDGVTHFIRPKFSGSNS-KALETSK 149 

E ATIRTH Y G+THFIRP F+ +S K +T K 
Sbjct: 121 YKTELATIRTHDYTGITHFIRPNFATESSVKKKDTKK 157 

Score = 61.7 bits (147), Expect = 6e-09

Identities = 41/125 (32%), Positives = 57/125 (44%), Gaps = 8/125 (6%)

Query: 125 YYDGVTOFIRPKFSGSNSKALETSKVOTFGKWKRN^ 183 

YY+G T P +K + +T G W N YGTYY++E+ TF C I R 

Sbjct: 346 YYEGKTPV- - PTVVNQKAKTKPVKQSSTSG-WNVNNYGTYYKSESATFKCTARC<3rVTRY 402 

Query: 184 GS PKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNWQGTR - YYLPVRQWNGKTGNS YSV 242 

P + P Y+ VC DGYVWI + G + ++PVR W+ N+ + 
Sbjct: 403 TX3PFTTCPQAGVLYYGQSVTYDTVCKQDG YVWI SVTTTNGGQDVWMPVRTWD KNTD IM 4S9 

Query: 243 GIPWG 247 

G WG 
Sbjct: 460 GQLWG 464 

>gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN

(N-ACETYLMURAMOYL-L-ALANINE AMIDASE)

>gi|79687|pir||JQ1147 N-acetylmuramoyl-L-alanine amidase
(EC 3.5.1.28) - Staphylococcus aureus >gi|153067
(H76714) peptidoglycan hydrolase [Staphylococcus aureus]
Length = 481

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 PKFSGSNSKALETSKVNTFGK-WIOWQYGTYYRN^GTFTCGFLPIFARVGSPKLSEPNG 193 

p + SN + +♦ V WXRN+YGTYY E+ FT G PI R P LS P G 

Sbjct: 365 PVAT^SNESSASSNTWPVASAWKRNKYGTYYMEESARFTNGNQPITVRKVGPFLSCPVG 424 

Query: 194 YWFQ PNGYTP YNEVCI*SDGYWI GYNWQGTRYYLP VRQWNG KTGNS YSVGI PWGVFS 250 

Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 425 YQFQPGG YOTYTEVMLQDGHVWG YTWEGQRYYLP I RTWNGSAP PNQ I LGDLWGE I S 481 

Score = 78.0 bits (189), Expect = 7e-14

Identities = 48/109 (44%), Positives = 62/109 (56%), Gaps = 6/109 (5%) 

Query: 15 EG AGVDFDG A YGFQCMD LSVA YVYYI TDG KVRMWGNAKD A - 1 NNDFKG LATVYKNT P S F K 73 

EG + D YGFQC D + A ♦ + G + AKD N+F GLATVY+NTP F 

Sbjct: 18 EGKQ FNVDLWYG FQCFD YANAG - WKVLFG LLLKG LGAKD I PFANNFDGLATVYQNTPDFL 76 . 

Query: 74 PQLGDVAVYTNGQ YGHIQCVLSGNLDYYTCLEQNWLGGGF - DGWEK 118 

Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DG B+ 

Sbjct: 77 AQPGDMWFGSNYGAGYGHVAWVI EATLD YI I VYEQNWLGGGWTDG I EQ 125 

>gi| 1763243 (U72397) amidase [bacteriophage 80 alpha] 
Length = 481

Score = 118 bits (292), Expect = 6e-26

Identities = 56/117 (47%), Positives = 68/117 (57%), Gaps = 1/117 (0%)

Query: 135 P KFSGSNS KALETS KVNT FG K - WKRNQ YGTYY RNENGT FTCG FL P I F ARVGS PKLS E PNG 193 

P ♦ SN ♦ +♦ V WKRN+ YGTYY E+ FT G PI R P LS P G 

Sbjct: 365 PVATVSNESS ASSNTVKP VASAWKRNKYGTYYMEES AR FTNGNQP I TVRKVG P FLSCP VG 424 
Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGIPWGVFS 250

Y FQP GY Y EVL DG+VW+GY W+G RYYLP+R WNG + +G WG S 
Sbjct: 425 YQFQPGGYCDYTEVMLQDGHVWGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 481 



Score = 83.5 bits (203), Expect = 2e-15

Identities = 50/115 (43%), Positives = 65/115 (56%), Gaps = 6/115 (5%)

Query: 9 EWIYKHEGAGVDFDGAYGFQC^TOLSVAYVYYITDGKVRMW^ 67 

EW+ EG + D YGFQC D + A + + G + AKD N+F GLATVY+ 

Sbjct; 12 EWLKTSEGKQFNVDLWYGFQCFDYANAG-WKV1»FGLL^ 70 

Query: 68 NTPSFKPQLGDVAVYTNGQ- - - YGH I QCVLSGNLD YYT C LEQNWLGGG F - DG WE K 118 

NTP F Q GD+ V+ + YGH+ V+ LDY EQNWLGGG+ DG E+ 

Sbjct: 71 NTPDFLAQPGDMWFGSNYGAG YGHVAWVI EATLD YI I VYEQNWLGGGWTDGIEQ 125 



>gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphylococcus
aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EGAGVDFDG AYGFQCMDLSVA YVYY I TDG KVRMWGNAKDA INND F KG LATVYKNT P S F K P 74 

E G DFDG+YG+QC DL Y ++ ++ +<5 N+F A +Y NTP+FK 

Sbjct: 252 ENRGWDFDG S YGWQCFDL VNVYWNH LYGHGLKGYGAKD I PYANNFNS EAKI YHNTPTFKA 311 

Query: 75 QLGDVAVYT NGQYGH I QCVLSGNLD YYTCLEQNWLGGGFDGWEKATIRTHYYD 127 

+ GD+ V++ G YGH VL+G+ D + L+QNW GG+ E A H Y+ 

Sbjct: 312 EPGDLVVTSGRFGGGYGHTAIVTJIGDYIXSKLM^ 371 

Query: 128 GVTHFIRP 135 
FIRP 

Sbjct: 372 NDMIFIRP 379 



>gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]
Length = 383

Score = 84.3 bits (205), Expect = 9e-16

Identities = 48/128 (37%), Positives = 68/128 (52%), Gaps = 7/128 (5%)

Query: 15 EGAGVDFDG AYG FQ CMDLS VA YVYY I TDG KVRMWGNAKDA I NNDFKG LATVYKNTPS FKP 74 

E G DFDG+YG+QC DL Y ++ ++ +G N+F A +Y NTP+FK 

Sbjct: 252 ENRGWDFDG SYGWQCETDLVNVYWNH LYGHGLKGYGAKD I PYANNFNSEAKI YHNTPTFKA 311 

Query: 75 QLGDVAVYT NGQYGHIQCVLSGNLD YYTCLEQNWLGGGFDGWEKATIRTHYYD 127 

+ GD+ V++ G YGH VL+G+ D ♦ L+QNW GG+ E A H Y+ 

Sbjct: 312 EPGDLWFSGRFGGGYGHTAI VLNGDYDGKLMKFQS LDQNWNNGGWRKAEVAHKVVHNYE 371 

Query: 128 GVTHFIRP 135 
FIRP 

Sbjct: 372 NDMIFIRP 379 



>gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187
[Staphylococcus phage 187]
Length = 628

Score = 76.9 bits (186), Expect = 2e-13

Identities = 50/144 (34%), Positives = 68/144 (46%), Gaps = 18/144 (12%)

Query: 5 QXJAKEWIYIOiEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMW GNAKDAINNDF 59 

+Q +W G+GVD DG YG QC DL Y++ R W GNA+D + 

Sbjct: 12 KQWDWAINLIGSGVDVDGYYGRQCWDLP - NYI FN RYWNFKTPGNARDMAWYRY 64 

Query: 60 KGLATVYKNT PS FKPQLGDVAVYTNGQ Y GHIQCVLS - GNLDYYTC LEQNWLGGG F 113 

V++NT F P+ GD+AV+T G Y GH V+ Y+ ++QNW 

Sbjct: 65 PEGFKVFRNTSDFVPKPGDIAVWTCGNYNWNTWGHTGIWGPSTKSYF^ 124 

Query: 114 DGWEKATIRTHYYDGVTHFIRPKF 137 

A H Y GVTHF+RP + 
Sbjct: 125 YVGS PAAKI KHSYFGVTHFVRPAY 148 
>gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE ALE-1
PRECURSOR >gi|1890068|dbj|BAA13069| (D86328) ALE-1
[Staphylococcus capitis]
Length = 362

Score = 73.4 bits (177), Expect = 2e-12

Identities = 47/117 (40%), Positives = 61/117 (51%), Gaps = 10/117 (8%)

Query: 132 F I RP KFSGSN SKALETS KVNTFGKWKRNQYGTYYRNENGTFTCGFLP I FARVG S PKLSE P 191 

F++ GSNS TS N G 4K N+YGT Y++E+ +FT I R+ P S P 

Sbjct: 252 FLKSAGYGSNS TSSSNNNG- YKTNKYGTLYKSESASFTAN- TDI ITRLTGPFRSMP 305 

Query: 192 NG YWFQPNG YTPYNEVCLSDG YVWI GYNW - QGTRYYLPVRQWNGKTGNS YSVG I PWG 247 

+ Y+EV DG+VW+GYN G R YLPVR WK TG +G WG 

Sbjct: 306 QSGVIJIKGLTIKYDEVMKQDGHVWVGYNTNSGKRVYLPVRTWNESTG- - -ELGPLWG 359 

>gi|79926|pir||A25881 lysostaphin precursor - Staphylococcus
simulans >gi|153047 (M15686) lysostaphin (ttg start
codon) [Staphylococcus simulans]
Length = 389

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNENGTFTCG 175 

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 258 HFQRMVNS FSNSTAQD PMPFLKS AG YGKAGGTVTPTPNTGWKTNKYGTLYKS ES AS FT PN 317 

Query: 176 FLP I FARVGS PKLSE PNG YWFQ PNGYTPYNEVCLSDG YVWIGYNW - QGTRYYLPVRQWNG 234 

I R P S P + Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 318 - TDI ITRTTGPFRSMPQSG VLKAGGTIHYDEVMKQDGHVTW 376 

Query: 235 KTGNSYSVGI PWG 247 

T ++G+ WG 
Sbjct: 377 STN TLGVLWG 386 

>gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR
(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|79927|pir||S01079
lysostaphin precursor - Staphylococcus simulans bv.
staphylolyticus >gi|581744|emb|CAA29494| (X06121)
lysostaphin (AA 1-480) [Staphylococcus simulans bv.
staphylolyticus]
Length = 480

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNENGTFTCG 175 

HF R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 349 H FQRMVNS F SNST AQD PM P F LKS AG YGKAGGTVT PT PNTGWKTNKYGT L YKS ES AS FT PN 4 08 

Query: 176 FLPI FARVGS PKLS EPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW - QGTRYYLPVRQWNG 234 

I R P S P + Y+EV DG+VW+GY G R YLPVR WN 

Sbjct: 409 - TDI ITRTTG PFRSMPQSG VLKAGQT I HYDEVMKQDGHVWVG YTGNSGQRI YLPVRTWNK 467 

Query: 235 KTGNSYSVGI PWG 247 

T ++G+ WG 
Sbjct: 468 STN TLGVLWG 477 

>gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR
(GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|2072411 (U66883)
lysostaphin [Staphylococcus simulans]
Length = 493

Score = 69.5 bits (167), Expect = 3e-11

Identities = 48/133 (36%), Positives = 62/133 (46%), Gaps = 20/133 (15%)

Query: 131 HFIRPKFSGSNSKALETS KVNTFGK WKRNQYGTYYRNENGTFTCG 175 

HP R S SNS A + K +GK WK N+YGT Y++E+ +FT 

Sbjct: 362 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 4 21 



Query: 176 FLP I FARVGS PKLSE PNG YWFQPNG YTPYNEVCLSDG YVWIGYNW -QGTRYYLPVRQWNG 234 
I R P S P + Y+EV DG+VW+GY G R YLPVR WN

Sbjct: 422 - TD 1 1 TRTTG PFRSMPQSGVLKAGQTI HYDEVMKQDGHVWVG YTGNSGQR I YLPVRTWNK 4 80 

Query: 235 KTGNSYSVGI PWG 247 

T ++G+ WG 
Sbjct: 481 STN TLG VLWG 490 



>gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan
hydrolase) [bacteriophage phi PVL]
Length = 484

Score = 68.3 bits (164), Expect = 6e-11

Identities = 52/150 (34%), Positives = 71/150 (46%), Gaps = 17/150 (11%)

Query: 3 SQQQWCEWIYKHEGTteVTJFDGAYGFQCMDLSVAYV^ 62 

+-»■ QA++W G + D YGFQC D++ + IG+R+G IDK 

Sbjct: 4 TKNQ AE KW FDNS LGKQ FN PDL FYG FQCYDYASMF - FMIATGE - RLQGLYAYN I PFDNKAR 61 

Query: 63 ATVY KNTPSFKPQLGDVAVYTN- - - GQYGH I QCVLSGNLD YYTCLEQNWLGGGF - - 113 

Y KN SF PQ D+ V+ + G GH++ V S NL+ +T QNW G G+ 
Sbjct: 62 I EKYGQI I KNYDSFLPQKLDI WFPSKYGGGAGHVE I VESANLNTFTS FGQNWNGKGWTN 121 

Query: 114 DGW- -EKATIRTHYYDGVTHFIRPKF 137 

GW E T HYYD +FIR F 
Sbjct: 122 GVAQPGWGPETVTRHVHYYDDPMYFIRLNF 151 



Query= pt|110882 44AHJDORF012 Phage 44AHJD ORF|8391-8813|3 1

(140 letters) 

>gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN
SPOIIIC-CWLA INTERGENIC REGION (ORF2)
>gi|322189|pir||B44816 orf2 5' of autolytic amidase -
Bacillus subtilis >gi|142801 (N59232) open reading frame
2 [Bacillus subtilis] >gi|1217874|dbj|BAA06959| (D32216)
ORF121 [Bacillus subtilis] >gi|1303767|dbj|BAA12423|
(D84432) YqdD [Bacillus subtilis]

>gi|2635036|emb|CAB14532| (Z99117) alternate gene name:
yqdD; similar to holin [Bacillus subtilis]
Length = 140

Score = 80.4 bits (195), Expect = 6e-15

Identities = 45/130 (34%), Positives = 67/130 (50%), Gaps = 3/130 (2%)

Query: 4 VTCFRFTDSEAFHMFIYAGDIJCLLYFLFVLMFVDIITG 63 

+ F D ++P G +K L L VL +D+VTG+ KA K L S+ + G+ +K 
Sbjct: 8 INFETLDLARVYLF GGVKYLDLLLVLS 1 1 DVLTGVI KAWKFKKLRS RS AWFG YVRKL 64 

Query: 64 XXXXXXXXXXXXXXXXXXKGGLLMITIFYYIANEGLSIVENCAEMDVLVPEQIKDKLRVI 123 

G L T+ +YIANEGLSI EN A++ V +P I D-I-L+ I 
Sbjct: 65 IiNFFAVILANVIIHVLNLNGVTiTFGTVLFYIANEGLSITENLAQIGVTCIPSSITO 124 

Query: 124 KNDTEKSDNN 133 

+N+ E+S NN 
Sbjct: 125 ENEKEQSKNN 134 



>gi|4126631|dbj|BAA36651.1| (AB016282) ORF45 [bacteriophage phi-105]
Length = 135

Score = 76.1 bits (184), Expect = 1e-13

Identities = 44/115 (38%), Positives = 61/115 (52%), Gaps = 4/115 (3%)

G++K L + VL +DIITG+ KA K L S+ + G+ +K 
Sbjct: 17 GEVKYLDLMLVLNI IDI ITGVIKAWKFKELRSRSAWFGYV1UCMLSFLWIVANAIDTIMD 76 

Query: 81 XKGGLLMITIFYYIANEGIiSIVENCAEMDVLVPEQIKDKLRVIKND TEKSD 131 

G L T+ +YIANEGLSI EN A++ V +P I D+L VI++D TEK D 
Sbjct: 77 LNGVLTFATVLFYIANEGLSITENLAQIGVKI PAVITDRLHVIESDNDQKTEKDD 131 



>gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN NAGH
3'REGION (ORFD) >gi|1075967|pir||S43905 hypothetical
protein D - Clostridium perfringens >gi|455154 (M81878)






ORF D [Clostridium perfringens]
Length = 132



Score = 60.9 bits (145), Expect = 4e-09

Identities = 38/127 (29%), Positives = 63/127 (48%), Gaps = 3/127 (2%) 

Query: 1 MNEVKFRFTDS EAFHMFI Y - AGDUCLLYFLFVLMFVD I ITG I S KAI KNNN LWS KKSMRGF 59 

♦N +K+ +1+ A D+ L+ L V +F+D +TG+ K K+ L S +RG 

Sbjct: 5 IOTIKWGIVSIXSTLFTWIFGAWDIPLIT^^ 63 

Query: 60 SKKXXXXXXXXXXXXXXXXXXXKGGLIJ^ITI-ETYIANEGLSIVENCAEMDVLVPEQIKD 118 

+KK + I ++YI NEG+SI+ENCA + V +PB++K 

Sbjct: 64 TKKGLI LVV1#LVAVMLDRLIjDNGTWMFRTIjI A YF YIMNEG I S I LENCAALGVP I PEKLKQ 123 

Query: 119 KLRVIKN 125 

L+ + N 
Sbjct: 124 ALKQLNN 130 

>gi|2293160 (AF008220) YtkC [Bacillus subtilis]

>gi|2635548|emb|CAB15042| (Z99119) similar to autolytic
amidase [Bacillus subtilis]
Length = 134

Score = 36.4 bits (82), Expect = 0.099

Identities = 25/109 (22%), Positives = 41/109 (36%) 

Query: 17 FIYAGDLKLLYFIjFVLMFVDI ITGISKAIKIOJNLWSkKSMRGFSKKXXXXXXXXXXXXXX 76 

F+G LLM++I+K +LKK KK 

Sbjct: 20 FFFGGFQYSFLILLSIiMAIEFISTTLKETIIHKIiSFKXVFARLVKXLVTLALISVCT 79 

+G + + I +YI E + IV + + ♦ VP+ + D L +KN 
Sbjct: 80 QLLNTQGS IRDLAIMFYI LYES VQ I WTASSLG I P VPQMLVDLLETLKN 128 

>gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophage
CP-1]

Length = 134
Score = 31.3 bits (69), Expect = 3.3

Identities = 27/88 (30%), Positives = 36/88 (40%), Gaps = 5/88 (5%)

LF L+ D ITG KA K S ++G K G +L 

Sbjct: 18 L FAL I LFDFI TG FLKAWKWKVTDSWTGLKGV I KHTLTF I FYYFVAVFLTYI HAMAVGQI L 77 

Query: 87 MITIFYYIANEGLSIVENCAEMDVLVPE 114 

++ I Y A LSI+EN A M V +P+ 
Sbjct: 78 LVI INLYYA LS IMENLAVMGVFI PK 102 
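
Each Identities / Positives / Gaps line in the alignment statistics above gives a count over the alignment length, followed by a percentage. The following is a minimal sketch of how such a line could be parsed and its percentages recomputed. The function names are hypothetical, and integer truncation is an assumption that happens to match the example line, not a rule stated in the listing; some stated percentages may differ by one.

```python
import re

# Matches e.g. "Identities = 31/121 (25%)"; assumes '=' separators and '%' signs
# are intact, so OCR-damaged lines will simply not match.
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_match_counts(line):
    """Return {label: (count, length, stated_percent)} for one statistics line."""
    return {
        label: (int(count), int(length), int(percent))
        for label, count, length, percent in STAT.findall(line)
    }


def truncated_percent(count, length):
    """Percentage rounded down to a whole number (assumed convention)."""
    return count * 100 // length


example = "Identities = 31/121 (25%), Positives = 47/121 (38%), Gaps = 31/121 (25%)"
counts = parse_match_counts(example)
for label, (count, length, stated) in counts.items():
    print(label, count, length, stated, truncated_percent(count, length))
```

A stated percentage that disagrees with the recomputed one is only a hint that a digit may have been misread; it does not say which digit is wrong.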
Table 21

Phage 182 complete genome sequence. 17833 nucleotides. 
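
The sequence rows in this table start with the 1-based position of their first nucleotide, followed by groups of ten bases. As a minimal sketch (the function name and the checks are illustrative assumptions, not part of the original listing), rows in that layout can be joined into one string while flagging any row whose stated position disagrees with the running count, or which contains characters other than a, c, g, t:

```python
def parse_numbered_sequence(lines):
    """Join numbered sequence rows; return (sequence, problems).

    The expected position is a running count of bases seen so far, so one
    misread position number is reported once and does not shift later rows.
    """
    sequence_parts = []
    problems = []
    expected_start = 1
    for raw in lines:
        fields = raw.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip blank lines and rows without a leading position
        stated_start = int(fields[0])
        residues = "".join(fields[1:]).lower()
        if stated_start != expected_start:
            problems.append(("position", stated_start, expected_start))
        unexpected = sorted(set(residues) - set("acgt"))
        if unexpected:
            problems.append(("characters", stated_start, "".join(unexpected)))
        sequence_parts.append(residues)
        expected_start += len(residues)
    return "".join(sequence_parts), problems


# Synthetic rows for illustration only; not taken from the phage sequence.
rows = ["1 acgtacgtac acgtacgtac", "", "21 ttttttttgg", "41 aaee"]
sequence, problems = parse_numbered_sequence(rows)
print(len(sequence), problems)
```

Rows without a leading position number are skipped by this sketch and would need separate handling.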

1 tagaatattg tcataaaaca caaacataat aatgcatatt attgtttaca aatatgtaat ttcgtgatat 

71 aatacatttg taagttaaag gaggtgacaa aagaacaaat cataaatgct ttagaaattg caaaaactat 

141 tggaggaaaa ataatgaaat attcactaca acaaatagat gaaattaaat caacaatttt cagaattaga 

211 ttaaaaaggc atgaactaga ggaattggtg gacgaagtaa acgatattgc taaagatccg gaggaaagat 

281 atcttttatc gttttattac acagaagaag aacgtttgtt tgaaattccc tctgcaagat taatagatta

351 ttacaacgaa aagatcacaa atctgaaatc ggaaatcata tcactcgaaa aaagattaca aaaactagta 

421 aaataatcac acaaaaagct ttacaaatat aacacatcat gttatactaa aagagtagta agggaacgga 

491 aaatacctta cttcacacct caatcattct tatcaaaata caaaaggagg gaaaataatg ggtcgaaaac 

561 taatgcaacg aaacgtaaca tcaactaaag tagaattctc agaagttatc gtacaagatg gagcgccaac 

631 aattgtacca tgcgaaccag ttgtcttaac aggaaaactt tcagaagaaa aagctttatc agcgatcaaa 

701 cgtaaaaacc ctgataaaaa cgtagttgta acaaatgttt cacatgaaac agcgctttac acaatgccag 

771 tcgataaatt tatcgagtta gcagacaaat caacacaagc ctaataaaaa caaaactaaa acaaaacaga 

841 ggagattata atcatggaaa tcgtaaaaag cacatttgac acacaaacac cagaaggaat gttacaagta 

911 ttcaatgcca caaacggggc ttcaattccg ctacgtaacg caattggcga agtactagaa tcgaaagata 

981 ttctagttta ctcagacgaa gtttctggtt ttggtggagc cgaaccatca caagcagaac tagtcgcttt 

1051 cttcacagaa gatggtaaaa cttatgcggg tgtatcagca gtagcaacaa aatcagctaa aaacctaatt 

1121 gatatgatga ctgctaaccc tgacatcaaa ccaaaaattt cttttgtcga aggaaaatca aacggtggac 

1191 aaaaatttgt aaatctacaa gtggtttcac tgtagcataa aaatacagga atctagtaag ccacttagcg 

1261 aatctcgcta ggtggttttt attatgtttc tacattgagg tgtgtagaat tgaccgtaag aatatcaaag 

1331 aatgatagag ccaagttaga gaaaatctac ggtaaatcta acaaagctcg taaaaaatac aatcgtttaa 

1401 gacaaaaagg agttgaggaa aggcaacttc caactgttcc aacatcaaag aaaagactta ttgactacgt 

1471 aaaatcaaca aatatgagtc gtagtgattt taacaagatg ttagacgagt tggtagattt tgcacaacct 

1541 tacaacgaga attacatttt tgagatcaac aagcgaaatg ttgcaatctc aagagcgcaa atcaaagaag 

1611 cgcaaattaa aacagagcaa gctcaaaaag cgaaagaaga acactacaaa gagcttaaca aagttgaagt 

1681 taagaagccc acagaaaaca caattgtcac accaactatt ttaacagagt taggtgctga cttacctttt 

1751 caagcaatac cagattttaa tattgacgct ttcacttctc cagaaggagt tcagtcttat ttagaaaata 

1821 taggaaaaca agacgaacaa tattttgacg aaagagacca actttattac gacaatttca gacaagcgat 

1891 gtttactatt ttcaattcag acgctgacga tattgttcgt ttacttgact caatggggct tgatctattt 

1961 atgaaaacat atgttagtaa cttcttagac atgaaccttg actacattta tgacgaagca gaagtacaac 

2031 agaaaaaaga acaagtttac agtaagattg caaaagtgat cgagtctgaa acaggtggag aagtcccctc 

2101 atataacccc acgaagaaca tcacaattaa ttcagaaaca ggagaagaat tatgattaag aaatatactg 

2171 gcgactttga aacaacaact gatctcaacg attgtcgtgt atggtcgtgg ggcgtatgcg atatagacaa 

2241 cgttgacaat atgacgttcg gtttagaaat cgattctttt tttgagtggt gtaaaatgca aggcagcaca 

2311 gacatttatt tccacaacga aaaatttgac ggagagttta tgctttcatg gttattcaaa aatggtttca 

2381 aatggtgtaa agaagcaaaa gaagatcgaa cattctccac actcatatca aatatgggtc aatggtatgc 

2451 tttggaaatt tgttgggaag ttaattacac aacaacaaaa tcaggtaaaa cgaaaaaaga gaaatctcga 

2521 acaataattt atgatagcct taaaaaatac ccttttccag tgaaacaaat tgcagaagct tttaattttc 

2591 ctataaaaaa aggcgaaata gattatacaa aagaaagacc tattggttac aaaccaacaa aagatgaatg 

2661 ggagtattta aagaacgaca ttcagattat ggcgatggca ttaaaaattc aattcgatca aggactaact 

2731 cgaatgacta gaggaagcga cgctttaggc gattacaaag attggctaaa agctacacat ggaaaatcaa 

2801 ctttcaaaca atggtttcct attttgtctt tagggtttga taaagactta cgtaaagcat acaaaggcgg 

2871 cttcacttgg gtaaacaaag tttttcaagg gaaagaaata ggtgacggca ttgtctttga tgtcaactct 

2941 ttgtatccct ctcaaatgta cgtaagacct ttaccatatg gaacacctct attctacgaa ggagaataca 

3011 aaccgaacaa cgactatccg ctgtacattc aaaatatcaa agtaagattc cgtttaaagg agggttatat 

3081 tccaaccatt caagttaagc aaagttcatt attcattcaa aacgaatatc ttgaatcaag tgtaaacaag 

3151 ttaggagttg acgaattaat cgatcttact cttacaaatg ttgacctaga attatttttt gaacactacg 

3221 atattttaga gatacattac acttacggat atatgttcaa agcttcttgt gatatgttca aaggctggat 

3291 cgataaatgg atcgaagtaa agaacaccac cgaaggggct agaaaagcta acgccaaagg tatgttaaat 

3361 agcttgtatg gaaagttcgg aacaaaccct gacattacag gaaaagtgcc ttacatgggc gaggacggca 

3431 ttgttcgatt gacactagga gaagaagaat taagagatcc tgtttatgtt ccgcttgcta gttttgtgac 

3501 ggcttggggt agatatacta ccattacaac cgctcaaaaa tgttttgatc gcattattta ttgtgataca 

3571 gatagcattc atctagtagg aacagaagtt ccagaagcaa tcgatcactt ggttgatcct aaaaaacttg 

3641 gttattgggg gcatgaaagc acatttcaac gagcaaaatt cattcggcag aaaacatacg tagaagaaat 

3711 tgatggcgaa ttaaatgtaa agtgtgctgg tatgccagat cgaataaaag agattgtaac ttttgacaat 

3781 tttgaagttg gtttttcaag ctatggaaag ttgctaccta aaagaacaca aggtggcgtg gtattagtag 

3851 acacaatgtt tacaatcaaa taaggaggac taataatgga actatataaa gcaatgttta tcgtacgtga 

3921 tgaaggtact attgacggtt acgatactga acactatgta gatatttctt tacatgactt tgaagaaata 

3991 tatggaaaag aaacacgtga aattgaagca gtaacattag taaaaacagg aaatttaaaa aaataaatta 

4061 tttacatcct ttgcaaagta tggtaaaata ttcttgtgat agttgacaag agtcaaattt ggcgagattg 

4131 ggcgaatgta cacgtgaaat atcgtgcgct cccgttaagt tatggacaca taaacgtttt gaccgtcaac 

4201 caatcgcaaa aaccttttag gagtagccct taaatgtggc tactcttttt tgtgtttcac agaattatgt 

4271 ttcacgtgaa acagttttta tggtataata gaatcaaaag gaggtggaga ttatggaaat taaagaacat

4341 gaatcaattt taaatggtat tcttgaaagt gtcacagacg gtgaagcaag atcaaagatt gtagaacatc 

4411 ttgaagcatt gcgagaagac tacggagcaa caactgaagc tttgacatca gcaaatagca cacttgaaaa 

4481 gttaaagaaa gataacgaag cgttggttat ttcaaactca aaattgttcc gagaacgagc gatcgtagaa 

4551 ccagcagaaa ataacgaacc agaaacagac cagaatatta cactagacga tttaggaatt taaggaggaa 

4621 aaaacatggc tgacaaaatc acagaacaag atgttcttcg tgccacaaat gtagaaacac cagtacaatt 

4691 aatgactgct atttataata gttcatcatc tctttttcag gcgaacgtac ctatgccaaa tgcagataac 
ttggtgcagg gatcacacgt
taaagcagtt atccgataca 
ggtcgaacga ttgaagaaat 
caggggtatt taaacaggaa 
caaacaaacg atccaagaag 
gctggtgtaa tgaacgcttt 
caaaccacca agaaaaagag 
atttatccgt aagatcaaat 
gttaaaacat ctacctcaaa 
ttgacgtttt agcagcggca
gtttcctaaa aaagaaggcg 
atctacgaca aatcgtacaa 
accaccacca actatattct 
tgtcacaaaa gttgcttttg 
acatttacac cagtagaagc 
caaccgtaaa acaaacagca
agtaacattc acagctatcg 
caattatggc aagaaggtat 
cacaagatgg tttaaaactc 
agagattgtt cttatcaaag 
tatatgcttg taactatctc
tactgatatt gaatataaga 
cgtttcgata ttggtatacg 
tacctttcat taatacaatt 
ttttcatcct aacgatggag 
gaagataaat caggaggatc 
caagtgggga ggtatacaaa 
aacgaaagaa ccttttttaa 
gtggatcacg cgaacaaaac 
ctagtgatcc aacaggaaca 
taaaagaatt gatcttgtag 
tcaaaactat ttatgtatcc 
gacctgaata tcttacaggt 
gatgatcgag ccgattgatt 
atcgataatg atcctaacga 
actccttgat tgctcaagag 
tacaggagga gcgatctttt 
gcaggacaac aagtaaacaa 
cagatatcga aaatattcca 
tcaaaactat tatcaattgc 
tcaatgtatg gcacaaagag 
ttaaattaaa agaaccaaat 
tagtgcaggc gttacgcttt
gaaggaggaa taagatgagt
cagaccttat ccaaatgaac
caactcacgc tccttacgtt
tagaaattgc tttacacact
cgcaggggca gaagatggtc
tatcacaaga gatatcctgt 
ataatgactt gaaagttcct 
gatatcacga gtgaatcgaa 
tcattgctac aagcttataa
ttgacgaatc ttttaatgta 
cgaagtatgg aatgaagtgt 
caaacatcag aagtcttatc 
aagagttttg cgatcgtgta 
aacagacgcc gttcgacaat 
ccaagtgcta cttaaacgtt
attgaagttg gccgaaaaca
ttgaaacaaa atttatcaat 
taatcttgac gaatatttaa 
tttccgattt ttgatgacat 
acatcaaagc gaatcgtgat 
aaatacacgt gacacaggaa 
ttgagaattg ccagcaatgg 
gtaaagaaac aacaagctcc 
tgcttctgaa aaagaaacaa 
cgatataaag gtaaaaaggg 
gaattgagaa aatgatcttt 
gcaacaatgg tagattttaa 
gcaaatatcc tcatactgaa 
tctgaatgaa gttggtgctt 
gagaagttag aagagatcac 
tcaatgatac tgtttttgca 
tgctaacagt gtgaatattc 
aagattcaac gcgacaatac
ttagacgtag taaaaaacga
aatcttggcg taaccctttg
ttttgttgac attgcacagg
gttcccgatg taaaaacatt 
catggttaga aaaagcattt 
atacacaggt gacgaagtaa 
ctattcaaag agatcgaaat 
caacctctaa caaattagaa 
atctgatcaa tacgttatta 
ttcaatatga gtaaaactga 
aagaatcgtc aaatattgtg 
aacaacaagt ctatacaacc 
acttctcaat tcgggaacgc
caagtgcaac aactagtgtt
aacaaaccaa caaggagaag
ggtaaagcga ctgccgtaac
gaggtcaaca agcaacggtt
acaaatgtaa aattgttggc
aacaggaaca ggaatcgtac
ggatacacaa ctcgggggag
atctttaaaa acgaagaaac 
atgacaacac aagtttcgtt 
agaaagtttc attgcaaaag
gaagagtcgc ttgattacgg
tcaattttct tgttattcta
aatagtaggt ggcccatctc
ccaaatgggg caggcaatgc
ataagatagt cgggatgtat
ggtaaggtat aatgcaggag
atgaaaacat tcgctttctt
ggaacgtgta taactacttt 
ctattgttta atagaaatta 
ggtaaattga gtgtatatgt 
atgatgtaag taactcaacc 
tgtaggagtt aaatctgact 
caaaacattc gcaatacttt 
cagccttagc aagtaacaac 
ctatgtttct gaaaaagaaa 
gataatgtaa cacagcttgg 
gcttcaaaca aattaaatat 
caatcgagta gctacaccaa 
attgtaggca caatgagtaa 
ggcatacgaa tgatgttttg 
agacgaaaag gtgcaggact
cctattcaag tgatgtagaa 
tcagttgttt gaatgggaaa 
aatggttatc ttggtttctt 
aaatcgatca ttatcacaac 
tttaagatat gatgatgatg 
acgttaccaa gtttacatcg 
gagcgcaaaa aacacctgta
ccaaattgac gaaaataatc
tggcaaacaa atgctccata
taacttttct aggtatcaac 
taacaatgaa cagattgaaa 
aatcgtgtct ttggcgatga 
tacaactggc ggcaggtcaa 
atattgaaag tttcacttat 
attgtttgat tttgattatc 
cacttttact tgagagagat 
atctaaacat gccctattgg
ggactacacc attgatgaga 
gaatcgaaga accaaacgaa 
caaccgattc tttctcaagg 
agatggaaca ggtgtaatca 
acaggcgttg aaacaaacaa 
agaacacaga cattaataaa 
aaacactgat tatgctgact
agagaaatga acaaggaagg 
ccccgacaag cggtttgacg 
tacagatatg aattactatt 
tagttaatga tatgagtggt 
aaatgacaca ctcaaaaaat 
aactatatca aagaaatcaa 
ttttgacaaa aaataaaccg 
tgattatgga gccgatccta 



atttatttca actttagttg 
aaaatgttta aaaaaggaaa 
aacataagtt caaccctgac 
gttccacgaa attaatcgtg
acttcatggg ataatttcaa
gcgaatttga atacacgaaa
tggcgaaatt actgaatcaa
tttatgagtt ccgcttacaa
ttgacgccga cacagacgca
ctttgtagga cacaaaatcg 
gcagttattg tagatagtga 
ctgaagggtt atattggaat 
tgttgctttt gttaaatcag 
gttaaaggat catctaaaga 
ttgtttcatc agcaccagca 
cgtagaaggc ttagaagtcg
cttgttacgg ttacttctga 
taacgtgcct tttgataaca 
tttaattcgt ttcctgttct 
tttttagagt agataaacac 
ttatcctagt aaatggcagt 
acctttgaaa ttgatgtttt 
aacaccctca actttattat 
tagagaatac acaacaacaa 
acaagtgaag caatgccagt
ctttttccta ttatttactt
taattttgga gagtacatgg
gtaacgtcgt atacaggtat
gttcttataa gatcatgctt
ttgtgtaaaa gaagcaagaa 
agagaagctt ttccgtttaa 
cagatacaaa aggacatgta 
aaaaggttcg ttaggaattt 
attattacca atttaagtga 
atgcttctgc attcatgcaa
cagacatggt atgggaaaca 
ccttttgttg gtttgactaa 
acggtttgaa cctcttggca 
atcaaactta tctttcacaa 
gagtatgcaa caagacttga
acttacaaac aagaaaagca
cgatgtatta acacgtgtga
aattataacc aagacaacgg
tgctagaaat aaccgttata
gaaatcagct actatgaaca
atttgccaaa atcaattgac
taaagaccct acacttgggt 
cctattttct ttacagcaaa 
atgataaatc aaaatgtatc 
ttttgcttta gatatggcgg
attattcaaa ctgatgaaaa 
aggctgtttt tgtggataaa 
tgtagtagat aaactacgat 
aatgctaacg tagataagac 
gttcaggtaa catcttgtta 
acttgacgga aagattgacg
tcaaaaaaag accagatgag 
taccaacctg aattatctcg 
cgttttatga cgaaacaaaa 
aggctcagaa acgatgggat 
aataaaatgt tcctatcaaa 
aacagaaatt gttaaatgag 
gcaagtagat caaacagaca 
aacacttata cagacacccc 
attatgcaac aaatatcaca
cgacaaaaca aatcaaaata
gatcaaaatc aaaccaaaga
tactcgaaaa atatcgtaga
cttatttctc cttgtttatg
gtttacccgc tgtattcaaa
agatgaagaa gtatcggctt
tatttaaatt actttatcga 
ggttgtctga tggtacgtta 
aagattacaa atcttggttg 
gatgttgctg atgatcgaac 
ttgacacgtt acgtattgtt 
10081 gcaatcaata aagttagtgg ctggaatacc

10151 gtgtataatg gcagacatta gaacacaacc 

10221 aaagccgtta atattatgac taatagcggt 

10291 acgaaacaat gaatacctca gttcaaaatg 

10361 attaaatgta aatgttggta aactaaccaa 

10431 ggtattcgtt atgtagaggt gtaatatggc 

10501 ttaatgcctg ttacaattgc taaaaatgtt 

10571 taagaggtaa cgctagtgaa gctaaaacac 

10641 agaaattgac acagtaacat caaccgcaaa 

10711 gaacaagcga aaacaacagc aaacagtatc 

10781 cacaaaaaag tgcaactgat ctagctgttc 

10851 attaccatag gaggaaaaat aatggcaaat 

10921 atccaagtgt tcgagcagaa aacttgttag 

10991 attatatgca gctggtgata aaacaaatgc 

11061 ataaagttta ctgaaagttt gacaaaccct 

11131 caaaacgtat tggttgtttc gcaaaatatt 

11201 atcaatcaca cctgatggca aagtaaccgt 

11271 aattgcgttt tccctctaaa ataaggaggt 

11341 aaagaagaaa atcaaaagaa ttacctatcg 

11411 cattgcatgg aaatgaaaat caggagagtt 

11481 tgatgtaagt gcttatgggg ttatcgctga 

11551 gaagaaaaaa gcgaaatggg tatcactctt 

11621 ctaacaccat tgaattgaaa cgtgatgtac 

11691 gtttgaaaca atgacggcat ttaatgtaaa 

11761 ccacaaggcg ctcaaagtgg taaaggaatt 

11831 atttgtttgt tcgtaactgt actttaaatg 

11901 atttgaaaat tgtctattct ctaatatctc 

11971 atgtggcaag ggaacgatat caatactagg 

12041 ttcatttttg tacagcgatc attatcgaca 

12111 ttctggtaac acaatcgaag gtggcgtaag 

12181 aacaaccatt ttctagcata cggaaataga 

12251 ttgatgtaga tgtttattgt cgtaactcac 

12321 tgttgtttac ggacattacc gaaacttaaa 

12391 acgttgtatg gcggtggcgt taatttctat 

12461 accggtttat tcaaacggct gacaatcgag 

12531 aacaaaagta aatacaccaa tgatctataa 

12601 gtgctaacag gtccaaatgc aagtaatgta 

12671 acaaatagct agaggacaaa caatcgctaa 

12741 ggagttgtcg ccaatctcca ttgggaatcg 

12811 gatatgggtt aggtcaatgg acgcctaaaa 

12881 tgctaaagct gaaacgttgg aaggtcaagc 

12951 gataatacac ctgtttcttc tgcaggttat 

13021 atattgatgt tgctacaatt aattttatgt 

13091 acttgatctt gcacaagctt atagtaagca 

13161 ggaaccccaa tcaagaatac aaatcttgat 

13231 caggaaacgg cagaccaaat aatttccatg 

13301 aatgattgca tgttgcgatg gaacagtaac 

13371 ataaatgatg gtacttacaa tatcgtttat 

13441 ttggcgacaa agttaagaac ggacaagttt 

13511 taaaaaagat tttatgactg cgttaggatc 

13581 tttttagggc aatgttttgg agatggagat 

13651 atcttattta tctattgcta tccgatgcct 

13721 agaatatatc acacaatggt tggcagatga 

13791 gcaatgatta tcgattttgt gttaggtttt 

13861 ttaaagctaa agcaggtatc attgttaagg 

13931 agtaaaattc ggtgcagtag gtattacaat 

14001 tatagtatac taggacatat ttcagatatc 

14071 tagacggaac actcaacaga aaggacgata 

14141 ggaattgatc tttcaaaagt tccatgcgat 

14211 accctgattg tgaccgagca tttcaacaag 

14281 gcatgagagg ggtttagaag gtacacccca 

14351 attggtaaag ctgttcttat tcttgacttt 

14421 ttcttgatta tgtttataat aaaacaggcg 

14491 aactgatttt tctagtattg caaaaggcga 

14561 caaggctact ctcaaccagc gccacctaaa 

14631 gtaaaggacg tttaccagga tacaacggca 

14701 ggatctgtat gtaggtaaaa aacaggatca 

14771 gatgagttta ttttcactct tacaacaggt 

14841 aattgtctga tccaacacaa ctcgatcata 

14911 atcaatggtg tggacacctg aacaatttga 

14981 taggagtgta tagtatgaca aatagcttag 

15051 caatgcttta ggttttaatt gcctaatgtt 

15121 tataaaaaat ttgttgttaa tcgctttatt 

15191 cagaacttaa aaagattcct caatttttca 

15261 aaaaggaaaa gaattctatt gtgatgataa 

15331 gaaaaatcta atgaatatcc cgaagttcgt 
gctacaggag atatttatct taacattaaa ggaacggagg
aacaagtgaa gatggatcag acaatttatt tccaatttca 
acgaatgtag aaggagaatt gggtacactc aaacaaaatg 
ctgtagttac tgccaatcaa gcaaaagatt ctgtagctga 
tcgaataaca acattagaga gtacagtggc taatcttgat 
agataaaaat attcaaatgc aggataaaga tcataatcgt 
ctaacaggcg actctaatct tgaattagtt aatgctgaaa 
ttgcacaaca agctaaagaa actgctgctg gtttgtcaac 
tcaagcgttg acgaaggctg gtacagcaca acaaaccgca 
agcgcagttg caacggcagc taaaaacaca gctgattcag 
gagtaagcag tttagaggac acagcaatac aatatactgt 
aaaaatattc aaatgaagga tagcaatgac aataatttat 
atttgaccag tcgtgctgaa ttaacaatga caaattgtca 
aatctcttat ctcggtgcag taggtatgct cgaaggtatg 
gtgatcacaa cgctaccaga aggttttaga ccaataagaa 
acacaccaaa tccaacagat acaaaagaaa tggtttatgt 
aaatgacaat gtaggtaaaa tcgaatatct atccctagat 
tcatatggaa gaacgaattg atattcaaat gaacaagatg 
caccctgaaa cgaacccgaa acaagttgtt tttgatgaaa 
tcaacaattt tgttgacaca agaaaaatga caactacaat 
cggtgtaaca gattgtacac caatattaaa taaattactt 
tattttcctc cttgtgaacg tgattcatat tatcgctttg 
ctgtagttac tttcttagga tcgggagaaa cgacattaaa 
catcgaaagt ttcaatattg atggttttgc attatggttg 
ttctttaatg atactcgcaa ttacaatcgt tttgactttg 
aaggaacgta tgttgttgtt gctagaggta gaggggttac 
tcaagcaatt atcaaaacag cttttcccga tgtaaatggt 
ggtacaggtt ttagaggttt ctttgtgaaa aacaaccgta 
atgacgatga ttatcagaat gtaattaatt tctgtgaaat 
ttattatcga ggatatgcgc ataacttgca tgtccaaaac 
aacgctttgt ttgagtttca agatgtggat caagcttata 
aagtcgaggg aatgaatagt acagctattt cacgtttaat 
gattacaggt aaattatatc gttgtcaagg acatgttatc 
tgtgacttga tggcacaaga agcacctttg acggacggtt 
ttaactatga tgggtttgtt gttcgtggtt tgtctaattc 
agcacctcag actgttttct ataatcgtag aatcgatcat 
tataactagg aggatatgag atggcaactc ttacaaatga 
aatactttca aaatatggct ataataaaaa ttcacaagta 
gctggtttga acccgaacag caatgaatat ggtggaggcg 
gcaatcttta tcgccaagca caaatttgtg ggttgtctaa 
agagatcatc gctcaagggg ataaaacagg tcaatggatg 
actaaccctc agaccctttc agcatttaaa caatctgcaa 
gtcactggga acgccctggt aaacttcata tcgaagaaag 
tattgacggt agcggtggcg gtggcgtaaa acgttgctat 
cctaaaagtt tcatgagtgg acaacttttt ggcacgcatg 
atggtttgga ctttggttca attgatcacc ctggcaatga 
acatgttgga acaatgggag cattaagagc gtattttgtg 
caagaattta gttataacca gtcaaatata aaggtaaaag 
gcgcaatacg tgacgcggat catttacatt taggttttac 
ttctttcata gatgatggaa catgggaaga ccctttgaag 
actggcggag ataatgacga taacaataag gataaaaatg 
tgaatggttg gaaattttaa taaggagaaa aaggtatgat 
taatcatctt gtttatggtt tgattatatg gttaatggtt 
acaattgcca aatttaacaa ggaaatcgac tttagtagtt 
tggcagaaat ggttttagtg gtttacttta ttcctgtagc 
gtatataaca atgttggttg gtttgatttt atcagaaatt 
gatgatgata ataattggac tgattatgtt aagaagtttt 
ttaaatgatg aatggtattg atatctctag ttatcaaaca 
tttgtaaata ttaaagcaac aggcggaaca ggttatgtaa 
ctttgtcttt aggtaaaaag attggtgtgt atcattttgc 
acaagaagcg caattctttt tagataatat taagggttac 
gaagggtcaa atcagaaaga tgtaaattgg gcgaaagcat 
ttaaagcatg gttttatacg tatacagcaa acctcaatac 
ttatggttta tgggttgctg aatatggatc aaatcaacca 
acaaataatt ttccaattgt tgcctgtttt cagtttacaa 
atcttgattt gaatgttttc tatggcgatg gtaatacatg 
aattgttcct cctgaaaata aaatatttga cgccacaagt 
agcacaagcg tgttttattt tgacggagaa acgatctttg 
ttagaggaac atacaatcat gttcatggaa aagaaatccc 
tatttactta aaaatgtatg aaaagaaacc agtatataaa 
gcgttaaact tgaagagaaa aacttatact ataaccctaa 
gtttgtaata ggcgcacgtg gtataggtaa aacttatggt 
aaacacggcg aacaatttat ttatttaaga agattcaaaa 
aaacaatggc gaaagaattt cctgatcata aacttgaagt 
attaatgggt tgggctgttc cacttagtac gtggggaatt 
acaattttgt ttgatgagtt tttaattgag aaatcaaaaa 
15401 tcacttattt accaaacgaa gctgaagcct tattgaacat gatggaaacg gttttccgaa gacgtacaaa
15471 tacaagatgt gttatgttga gtaatgcaac tagtgtagtg aacccttatt tcttgtattt caatctgcag
15541 ccagatttga ataagcgttt taatctatat caagatcgag gtatattgat tgaattgtgt gattcaaaag
15611 actttgcaga agtgaagaga gaaacacctt ttggtagatt gattcgtgga acagaatacg aagattttag
15681 tatcaacaat gagtttgtca atgatagtga tacgtttatt gaaaagagaa gtaaaaatag tagtttctta
15751 tgcgccattg cttttgaagg gaaaatcttt gggtattgga tagacgctga aacaggttgt gtctatgtga
15821 gttatgatta tcaaccaaat acaaatcatt tttatgcaat gactacgaaa gaccatgaag aaaatagatt
15891 gctgatgaaa aattggcgaa ataattatta tctttcaaca gtggcgaaag cattcaagaa tagttatctg
15961 cggtttgata acattgttat taagaattta cattatgatt tgtttaataa gatgaaaatc tggtaaccct
16031 attttagtag agctaccacg attagttcta ttacaatgat gaatagtaga taacatagta attgtagtct
16101 gcgatagttt tgttttggtt ctttggcgtt agtgattttt gctaacgcct ttttgtttgc ttttggatcg
16171 ggtgtgttaa tgtagacgaa atcttttctc atagttcttt ctccttatac agttttaata attccctgta
16241 aaatgtagct ataggacgtc catttctttc tattctaacg caaetcacta tatccatttc taggtatata
16311 cggctatatt ttaatgcttt tgttaaggtg agaggttcgg ttttgtgtat caaaacctcc caaccatcta
16381 tataaaatac tgtgatatcg tatattggtt ccttgtagaa tgtagccatt attccacctc ctttaaatag
16451 ccttttggta tttgtaacgc taactgatag cgagaaccaa cttttacgta tgaagttact aatttcattg
16521 cctgacaata cttttcaaga atgttaaatt gactcgattc gggtaatagc gttgaatgag ttaacaaaag
16591 ttcggtgata cttatttccg gaacgtcgaa atcttgtaaa gtcccctcta tgatctctat tttttcattg
16661 tctgaaaggt tacgtttaca gtagaaacgt aaccattcaa ttagttcgcg gtgttctttg aatgttcgtg
16731 caatcatttt aattcctcct atttgtccgt aacttgttta tatccgtcat gtttcaattg ttccgcatag
16801 tgttcaacgc ttttcattga tttcgttatt gcgatattaa tgcaatggct atcaagataa acatagttat
16871 atttatcatg tgttaacacg aactcttttg taacgtaatc aatgtataaa attaattgtt ttcctccttg
16941 tgttatttct gacttgatag acgctaaact atcgttgtca tctttagtta gttgatttaa accctctaaa
17011 attaatgata aattgttaat catgtaaaac actcctttta tattaatttg atattgatac caccaatcga
17081 ataagattgg tagcattgta tcgaattaat atgttatttc tgtagttttc catgaatact cggaaataag
17151 atccatatct aattccttta gttcttcaaa agacaacaaa caatattcct catcgcctac ctcatcaata
17221 tcaataagat aatgtttatt gttttcggta tctatgatat gataattcat atcccactca ttaaaggggt
17291 gaagtagaga tacctctcct ttttcagcta ttaatgattt attgttcata tgaaacactc cttttatatt
17361 aatttgatat tgataccacc aatcaaatgt gattggtagc attgtattaa attaatattc tggataattt
17431 attgagaaag tccagttatc atcaaatgaa attgttttat tttcaagtaa ctttttagcc tcatccacct
17501 caaattctaa atagaggaat ttactaagtt tatcctcatc tctaaaaatt ttcatacata ccacgttatt
17571 tgaataaatt tctgtgtata cgatcggttc attcatgttt atcatccttt ctttattaca tatatagtat
17641 atcatgtatt tacatatatg tcaatcattt aattcattta ttttaatgat ttatttgatt gtttttttat
17711 gatcctttct ttattacatc tatattatat catgtatgat tgtatctgtc aacaattaaa ttcatataaa
17781 tgtagtttgg ggtcagttac atttgtgtta tcaaaaaaag ataatattct att
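
Table 22 lists several ORFs with a negative frame; these lie on the strand complementary to the sequence printed above. The following Python sketch is an illustration only and is not part of the disclosed method. It assumes the standard genetic code (the text does not state which translation table was used), and the helper names `reverse_complement` and `translate` are hypothetical. It takes nucleotides 17011 to 17033 of the listing and recovers the start of the 182ORF016 coding sequence, which Table 22 places at 16749..17033 in frame -3.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
# Assumption: the document does not say which translation table was used.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return dna.lower().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 17011..17033 as printed in the listing (blocks joined).
listed = "attaatgataaattgttaatcat"
coding = reverse_complement(listed)
print(coding)             # atgattaacaatttatcattaat
print(translate(coding))  # MINNLSL
```

The printed residues agree with the first residues given for 182ORF016 in Table 23.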
Table 22

Phage 182 ORFs list

nb   Name        Frame   Position (nt)    Size (aa)   Key words
1    182ORF001   2       5966..7780       604         Tail protein;
2    182ORF002   1       2152..3873       573         DNA polymerase;
3    182ORF003   1       11305..12639     444
4    182ORF004   3       4626..5954       442         Major head protein;
5    182ORF005   3       12651..13700     349         Glycyl-glycine endopeptidase; Lysostaphin precursor;
6    182ORF006   1       14995..16026     343         Encapsidation protein; ATP/GTP-binding site motif A;
7    182ORF007   1       7795..8775       326         Upper collar protein;
8    182ORF008   2       14105..14983     292         Lysozyme; Muramidase;
9    182ORF010   2       1310..2155       281         Terminal protein;
10   182ORF009   2       8765..9601       278         Lower collar protein;
11   182ORF011   1       9607..10158      183         Pre-neck appendage protein;
12   182ORF012   3       10872..11294     140
13   182ORF013   1       10456..10860     134
14   182ORF014   3       13716..14108     130         Lysis protein;
15   182ORF015   2       854..1225        123         Early protein;
16   182ORF018   -2      16429..16737     102
17   182ORF020   3       10158..10454     98          Leucine-zipper motif;
18   182ORF019   3       4323..4613       96          Head protein;
19   182ORF016   -3      16749..17033     94
20   182ORF022   1       12868..13149     93
21   182ORF023   -2      11914..12189     91
22   182ORF017   1       154..426         90
23   182ORF024   3       6174..6446       90
24   182ORF025   2       548..814         88          Early protein;
25   182ORF026   -3      12999..13259     86
26   182ORF027   -1      14642..14896     84
27   182ORF028   3       14430..14672     80
28   182ORF021   -3      17106..17339     77
29   182ORF030   -1      16199..16429     76
30   182ORF031   -3      8379..8603       74
31   182ORF032   -1      11195..11413     72
32   182ORF033   -1      4727..4942       71
33   182ORF034   -1      5951..6160       69
34   182ORF029   -3      17412..17606     64
35   182ORF035   -3      15570..15758     62
36   182ORF036   -3      2127..2315       62
37   182ORF037   -1      12095..12280     61
38   182ORF038   3       14769..14951     60
39   182ORF039   2       9992..10171      59
40   182ORF040   -3      16029..16202     57
41   182ORF041   1       3886..4056       56          Early protein;
42   182ORF042   -3      10671..10832     53
43   182ORF043   -3      10491..10652     53
44   182ORF044   -1      6299..6457       52
45   182ORF045   -2      6571..6729       52
46   182ORF046   2       2372..2527       51
47   182ORF047   -2      13201..13353     50
48   182ORF048   -3      3243..3395       50
49   182ORF049   3       1578..1724       48
50   182ORF050   2       8012..8155       47
51   182ORF051   3       9390..9530       46
52   182ORF052   1       4096..4233       45
53   182ORF053   2       15656..15793     45
54   182ORF054   -2      8002..8136       44
55   182ORF055   2       8324..8455       43
56   182ORF056   3       6549..6680       43
57   182ORF057   -3      8133..8264       43
58   182ORF058   -1      5048..5176       42
59   182ORF059   -2      15748..15876     42
60   182ORF060   -3      15276..15404     42
61   182ORF061   -3      1974..2102       42
62   182ORF062   -2      1867..1992       41
63   182ORF063   -3      14181..14306     41
64   182ORF064   -2      7234..7356       40
65   182ORF065   -2      3460..3582       40
66   182ORF066   1       4234..4353       39
67   182ORF067   -1      13763..13882     39
68   182ORF068   -1      7148..7267       39
69   182ORF069   -3      4908..5027       39
70   182ORF070   -3      912..1031        39
71   182ORF071   2       11741..11857     38
72   182ORF072   -3      11610..11723     37
73   182ORF073   -3      2763..2876       37
74   182ORF074   -1      8813..8923       36
75   182ORF075   -3      7353..7463       36
76   182ORF076   -3      2316..2426       36
77   182ORF077   2       11858..11965     35
78   182ORF078   -2      7564..7671       35
79   182ORF079   -2      7381..7488       35
80   182ORF080   -2      4372..4473       33

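
The Position and Size columns of Table 22 are related by simple arithmetic, which is useful when checking individual entries. The sketch below is illustrative Python with hypothetical function names. It assumes that positions are 1-based and inclusive, that Size counts amino acids with the stop codon excluded, and that the frame label of a forward-strand ORF is set by its start coordinate. These conventions are inferred from the table entries; the text does not state them.

```python
def orf_size_aa(start: int, end: int) -> int:
    """Amino acids encoded by start..end (1-based, inclusive), stop codon excluded."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"{start}..{end} is not a whole number of codons")
    return length // 3 - 1


def forward_frame(start: int) -> int:
    """Frame label (1, 2 or 3) of a forward-strand ORF, assumed to follow its start."""
    return (start - 1) % 3 + 1


# Entries taken from Table 22: (name, frame, start, end, size)
rows = [
    ("182ORF001", 2, 5966, 7780, 604),
    ("182ORF002", 1, 2152, 3873, 573),
    ("182ORF003", 1, 11305, 12639, 444),
    ("182ORF017", 1, 154, 426, 90),
]
for name, frame, start, end, size in rows:
    assert orf_size_aa(start, end) == size, name
    assert forward_frame(start) == frame, name
print("all listed rows are consistent")
```

A coordinate pair that does not span a whole number of codons raises `ValueError`, which makes misread digits in a position easy to spot.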
Table 23

Predicted amino acid sequences of ORFs from phage 182 

182ORF001 



5966 atggcaagaaggtatacaaatgtaaaattgttggctaacgtgccttttgataacacctatacacacacaagatggtttaaaact 

1 MARRYTNVKLLANVPFDNTYTHTRWFKT 

6050 caacaggaacaggaatcgtactttaattcgtttcctgttcttaacgagaatagagatcgttcttatcaaagggatacacaactc 

29 QQEQESYFNSFPVLNENRDCSYQRDTQL

6134 gggggagtttttagagtagataaacacaaagacgccttatatgcttgtaactatctcatctttaaaaacgaagaaacttatcct 

57 GGVFRVDKHKDALYACNYLI FKNEETYP 

6218 agtaaatggcagtatgcctttgttactgatattgaatataagaatgacaacacaagtttcgttacctttgaaattgatgtttta 

85 SKWQYAFVTDIEYKNDNTS FVTFEIDVL 

6302 caaacttatcgtttcgatattggtatacgagaaagtttcattgcaaaagaacaccctcaactttattattcgaatggaatacct 

113 QTYRFDIGIRESFIAKEHPQLYYSNGIP 

6386 ttcattaatacaattgaagagtcgcttgattacggtagagaatacacaacaacaaatgtaacaacttttcatcctaacgatgga 

141 FINTIEESLDYGREYTTTNVTTFHPNDG 

6470 gtcaattttcttgttattctaacaagtgaagcaatgccagttggagataaggaagataaatcaggaggatcaatagtaggtggc 

169 VNFLVI LTSEAMPVGDKEDKSGGSIVGG 

6554 ccatctcctttttcctattatttacttcctatcaattcaagtggggaggcatacaaaccaaatggggcaggcaatgctaatttt 

197 PSPFSYYLLPINSSGEVYKPNGAGNANF 

6638 ggagagtacatggcgtttcttacaacgaaagaaccttttttaaataagatagtcgggatgtatgtaacgtcgtatacaggtata 

225 GEYMAFLTTKEPFLNKIVGMYVTSYTGI 

6722 ccattcattgtggatcacgcgaacaaaacggtaaggtataatgcaggaggttcttataagatcatgcttccaacctacgctagt 

253 PFIVDHANKTVRYNAGGSYKIMLPTYAS

6806 gatccaacaggaacaatgaaaacattcgctttcttttgtgtaaaagaagcaagaacattcgtacctaaaagaattgatcttgta 

281 DPTGTMKTFAFFCVKEARTFVPKRIDLV 

6890 gggaacgtgtataactactttagagaagcttttccgtttaatgttaaggaatcaaaactatttatgtatccctattgtttaata 

309 GNVYNYFREAFPFNVKESK LFMYPYCLI 

6974 gaaattacagatacaaaaggacatgtaatgactttaagacctgaatatcttacaggtggtaaattgagtgtatatgtaaaaggt 

337 EITDTKGHVMTLRPEYLTGGKLSVYVKG 

7058 tcgttaggaatttctaataaagtgacgatcgagccgattgattatgatgtaagtaactcaaccattattaccaatttaagtgac 

365 SLGISNKVMIEPIDYDVSNSTIITNLSD 

7142 aagatgttaatcgataatgatcctaacgatgtaggagttaaatctgactatgcttctgcattcatgcaaggaaacaaaaactcc 

393 KML I DND PNDVGVKSDYASAFMQGNKNS 

7226 ttgattgctcaagagcaaaacattcgcaatactttcagacatggtatgggaaacagtgcaatgagtacaggaggagcgatcttt 

421 LIAQEQNIRNTFRHGMGNSAMSTGGAIF 

7310 tcagccttagcaagtaacaacccttttgttggtttgactaacatcatgggagcaggacaacaagtaaacaactatgtttctgaa 

449 SALASNNPFVGLTNIMGAGQQVNNYVSE 

7394 aaagaaaacggtttgaacctcttggcaggtaaagtggcagatatcgaaaatattccagataatgtaacacagcttggatcaaac 

477 KENGLNLLAGKVADIENI P DNVTQLGSN 

7478 ttacctttcacaacaggaaactttcaaaactattatcaattgcgcttcaaacaaattaaatatgagtatgcaacaagacttgat 

505 LSFTTGNFQNYYQLRFKQI KYEYATRLD 

7562 cgttacttctcaatgtatggcacaaagagcaatcgagtagctacaccaaacttacaaacaagaaaagcatggaacttcattaaa 

533 RYFSMYGTKSNRVATPNLQTRKAWNFIK 

7646 ttaaaagaaccaaatattgtaggcacaatgagtaacgatgtactaacacgtgtgaaacaaatttttagtgcaggcgttacgctt 

561 LKE PN I VG TMSNDVLTRVKQ I FSAGVTL 

7730 tggcatacgaatgatgttttgaattataaccaagacaacggagatgtatag 7780 

589 WHTNDVLNYNQDNGDV*
182ORF002 

2152 atgattaagaaatatactggcgactttgaaacaacaactgatctcaacgattgtcgtgtatggtcgtggggcgtatgcgatata 

1 MIKKYTGDFETTTDLNDCRVWSWGVCDI 

2236 gacaacgttgacaatatgacgttcggtttagaaatcgattctttttttgagtggtgtaaaatgcaaggcagcacagacatttat 

29 DNVDNMTFGLEIDSFFEWCKMQGSTDIY 

2320 ttccacaacgaaaaatttgacggagagtttatgctttcatggttattcaaaaatggtttcaaatggtgtaaagaagcaaaagaa 

57 FHNEKFDGEFMLSWLFKNGFKWCKEAKE 

2404 gatcgaacattctccacactcataccaaatatgggtcaatggtacgctttggaaatttgttgggaagttaattacacaacaaca 

85 DRTFSTLISNMGQWYALE I CWEVNYTTT 

2488 aaatcaggtaaaacgaaaaaagagaaatctcgaacaataatttatgatagccttaaaaaatatccttttccagtgaaacaaatt 

113 KSGKTKKEKSRTI I YDSLKKYPFPVKQI 

2572 gcagaagcttttaattttcctataaaaaaaggcgaaatagattatacaaaagaaagacctattggttacaaaccaacaaaagat 

141 AEAFNFPIKKGEIDYTKERPIGYKPTKD 

2656 gaatgggagtatttaaagaacgacattcagattatggcgatggcattaaaaattcaattcgatcaaggactaactcgaatgact 

169 EWEYLKNDIQIMAMALKIQFDQGLTRMT 

2740 agaggaagcgacgctttaggcgattacaaagattggctaaaagctacacatggaaaatcaactttcaaacaatggtttcctatt 

197 RGSDALGDYKDWLKATHGKSTFKQWFPI

2824 ttgtctctagggtttgacaaagacttacgtaaagcatacaaaggcggcttcacttgggtaaacaaagtttttcaagggaaagaa 

225 LSLGFDKDLRKAYKGGFTWVNKVFQGKE

2908 ataggtgacggcattgtctttgatgtcaactctttgtatccctctcaaatgtacgtaagacctttaccatatggaacacctcta 

253 IGDG I VFDVNSLYP SQMYVRPLPYGTPL 

2992 ttctacgaaggagaatacaaaccgaacaacgactatccgctgtacattcaaaatatcaaagtaagattccgtttaaaggagggt 

281 FYEGEYK PNNDYPLY IQN I KVRFRLKEG 

3076 catattccaaccattcaagttaagcaaagttcattattcattcaaaacgaatatcttgaatcaagtgtaaacaagttaggagtc 
309 YIPTIQVKQSSLFIQNEYLESSVNKLGV

3160 gacgaattaatcgatcttactcttacaaatgttgacctagaattattttttgaacactacgatattttagagatacattacact 

337 DELIDLTLTNVDLELFFEHYDILEIHYT

3244 tacggatatatgttcaaagcttcttgtgatatgttcaaaggctggatcgataaatggatcgaagtaaagaacaccaccgaaggg 

365 YGYMFKASCDMFKGWIDKWI EVKNTTEG 

3328 gctagaaaagctaacgccaaaggtatgttaaatagcctgtatggaaagttcggaacaaaccctgacattacaggaaaagtgcct 

393 ARKANAKGMLNSLYGKFGTN PDITGKVP 

3412 tacatgggcgaggacggcattgttcgattgacactaggagaagaagaattaagagatcctgtttatgttccgcttgctagtttt 

421 YMGEDGIVRLTLGEEELRDPVYVPLASF 

3496 gtgacggcttggggtagatatactaccattacaaccgctcaaaaatgttttgatcgcattatttattgtgatacagatagcatt 

449 VTAWGRYTTITTAQKCFDRI I YCDTDSI 

3580 catctagtaggaacagaagttccagaagcaatcgatcacttggttgatcctaaaaaacttggttattgggggcatgaaagcaca

477 HLVGTEVPEAIDHLVDPKKLGYWGHEST 

3664 tttcaacgagcaaaattcattcggcagaaaacatacgtagaagaaattgatggcgaattaaatgtaaagtgtgctggtatgcca 

505 FQRAKFIRQKTYVEEIDGELNVKCAGMP 

3748 gatcgaataaaagagattgtaacttttgacaattttgaagttggtttttcaagctatggaaagttgctacctaaaagaacacaa 

533 DRIKEIVTFDNFEVGFSSYGKLLPKRTQ 

3832 ggtggcgtggtattagtagacacaatgtttacaatcaaataa 3873 

561 GGVVLVDTMFTIK* 
182ORF003 

11305 atggaagaacgaattgatattcaaatgaacaagatgaaagaagaaaatcaaaagaattacctattgcaccctgaaacgaacccg 

1 MEERIDIQ MNKMKEENQKNYLLHPETNP 

11389 aaacaagttgtttttgatgaaacattgcatggaaatgaaaatcaggagagtttcaacaattttgttgacacaagaaaaatgaca 

29 KQVVFDETLHGNENQESFNNFVDTRKMT 

11473 actacaattgatgtaagtgcttatggggttatcgctgacggtgtaacagattgtacaccaatattaaataaattacttgaagaa 

57 TTIDVSAYGVIADGVTDCTPI LNKLLEE 

11557 aaaagcgaaatgggtatcactttttattttcctccttgtgaacgtgattcatattatcgctttgctaacaccattgaattgaaa 

85 KSEMGITFYFPPCERDSYYRFANTIELK 

11641 cgtgatgtacctgtagttactttcttaggatcgggagaaacgacattaaagtttgaaacaatgacggcatttaatgtaaacatc 

113 RDVPVVTFLGSGETTLKFETMTAFNVNI 

11725 gaaagtttcaatattgatggttttgcattatggttgccacaaggcgctcaaagtggtaaaggaattttctttaatgatactcgc 

141 ESFNIDGFALWLPQGAQSGKG IFFNDTR 

11809 aattacaatcgttttgactttgatttgtttgttcgtaactgtactttaaatgaaggaacgtatgttgttgttgctagaggtaga 

169 NYNR FDFDLFVRNCTLNEGTYVVVARGR 

11893 ggggttacatttgaaaattgtctattctctaatatctctcaagcaattatcaaaacagcttttcccgatgtaaatggtatgtgg 

197 GVTFENCLFSNISQAIIKTAFPDVNGMW 

11977 caagggaacgatatcaatactaggggtacaggttttagaggtttctttgtgaaaaacaaccgcattcatttttgtacagcgatc 

225 QGNDINTRGTGFRGFFVKNNRIHFCTAI

12061 attatcgacaatgacgatgattatcagaatgtaattaatttctgtgaaatttctggtaacacaatcgaaggtggcgtaagttat 

253 I IDNDDDYQNVINFCEISGNT I EGGVSY 

12145 tatcgaggatatgcgcataacttgcatgtccaaaacaacaaccattttctagcatacggaaatagaaacgctttgtttgagttt 

281 YRGYAHNLHVQNNNHFLAYGNRNALFEF

12229 caagatgtggatcaagcttatattgatgtagatgtttattgtcgtaactcacaagtcgagggaatgaatagtacagctatttca 

309 QDVDQAYIDVDVYCRNSQVEGMNSTAIS 

12313 cgtttaattgttgtttacggacattaccgaaacttaaagattacaggtaaattatatcgttgtcaaggacatgttatcacgttg 

337 RLIVVYGHYRNLKITGKLYRCQGHVITL 

12397 tatggcggtggcgttaatttctattgtgacttgatggcacaagaagcacctttgacggacggttaccggtttattcaaacggct 

365 YGGGVNFYCDLMAQEAPLTDGYRFIQTA 

12481 gacaatcgagttaactatgatgggtttgttgttcgtggtttgtctaattcaacaaaagtaaatacaccaatgatctataaagca 

393 DNRVNYDG FVVRGLSNSTKVNTPMI YKA 

12565 cctcagactgttttctataatcgtagaatcgatcatgtgctaacaggtccaaatgcaagtaatgtatataactag 12639 

421 PQTVFYNRRIDHVLTGPNASNVYN* 
182ORF004

4626 atggctgacaaaatcacagaacaagatgttcttcgtgccacaaatgtagaaacaccagtacaattaatgactgctatttataat 

1 MADKITEQDVLRATNVETPVQLMTAIYN

4710 agttcatcatctctttttcaggcgaacgtacctatgccaaatgcagataacatcgaagcggttggtgcagggatcacacgttta 

29 SSSSLFQANVPMPNADNIEAVGAGITRL 

4794 gacgtagtaaaaaacgaatttatttcaactttagttgaccgtattggtaaagtagttatccgatacaaatcttggcgtaaccct 

57 DVVKNEFISTLVDRIGKVVIRYKSWRNP 

4878 ttgaaaatgtttaaaaaaggaaacatgcctttaggtcgaacgattgaagaaacttttgttgacattgcacaggaacataagttc 

85 LKMFKKGNMPLGRTI EEI FVD IAQEHKF 

4962 aaccctgacgagtctgttacaggggtatccaaacaggaagttcccgatgtaaaaacattgttccacgaaattaatcgtgaaggt 

113 NPDESVTGVFKQEVPDVKTLFHEINREG 

5046 tactacaaacaaacgatccaagaagcatggttagaaaaagcatttacttcatgggataatttcaatagtttcgttgctggtgta 

141 YYKQTI QEAWLEKAFTSWDNFNSFVAGV 

5130 atgaacgctttatacacaggtgacgaagtaagcgaatttgaatacacgaaattattaatagcaaactaccaagaaaaagagcta 

169 MNALYTGDEVSEFEYTKLLIANYQEKEL 

5214 ttcaaagagatcgaaattggcgaaattactgaatcaaatgcaaaagaatttatccgtaagatcaaatcaacctctaacaaatta 

197 FKEIEIGEITESNAKEFIRKIKSTSNKL

5298 gaatttatgagttccgcttacaacgctcaaggagttaaaacatctacctcaaaatctgatcaatacgctatfeattgacgccgac 

225 EFMSSAYNAQGVKTSTSKSDQYVIIDAD 

5382 acagacgcaaccattgacgttgacgttttagcagcggcattcaatatgagtaaaactgactttgtaggacacaaaatcgttatt 

253 TDAT IDVDVLAAAFNMSKTDFVGHKIVI 

5466 gatgagtetcctaaaaaagaaggcgaagaatcgtcaaatattgtggcagttattgtagatagtgaatggtttatgatctacgac 
281 DEFPKKEGEESSNIVAVIVDSEWFMIYD

5550 aaattgtacaaaacaacaagtctatacaaccctgaagggttatattggaattattggttgcaccaccaccaactatattctact 
309 KLYKTTSLYNPEGLYWNYWLHHHQLYST 
5634 tctcaattcgggaacgctgttgcttttgttaaatcagcaacaaaacctgtcacaaaagttgcttttgcaagtgcaacaactagt 
337 SQFGNAVAFVKSATKPVTKVAFASATTS 

5718 gttgttaaaggatcatctaaagatatcgcattgacatttacaccagtagaagcaacaaaccaacaaggagaagttgcttcatca 

365 VVKGSSKDIALTFTPVEATNQQGEVVSS 

5802 gcaccagcattggttaaggcaaccgtaaaacaaacagcaggtaaagcgactgccgtaaccgtagaaggcttagaagtcggtcaa 

393 APALVKATVKQTAGKATAVTVEGLEVGQ 
5886 tcattagtaacattcacagctatcggaggtcaacaagcaacggttcttgttacggttacttctgactaa 5954 

421 SLVTFTAIGGQQATVLVTVTSD* 
182ORF005 

12651 atggcaactcttacaaatgaacaaatagctagaggacaaacaatcgctaaaatactttcaaaatatggctataataaaaattca 

1 MATLTNEQIARGQTIAKILSKYGYNKNS

12735 caagtaggagttgtcgccaatctccattgggaatcggctggtttgaacccgaacagcaatgaatatggtggaggcggatatggg 

29 QVGVVANLHWESAGLNPNSNEYGGGGYG 

12819 ttaggtcaatggacgcctaaaagcaatctttatcgccaagcacaaatttgtgggttgtctaatgctaaagctgaaacgttggaa 

57 LGQWTPKSNLYRQAQICGLSNAKAETLE 

12903 ggtcaagcagagatcatcgctcaaggggataaaacaggtcaatggatggataatacacctgtttcttctgcaggttatactaac 

85 GQAE I IAQGDKTGQWMDNTPVS SAG YTN 

12987 cctcagaccctttcagcatttaaacaatctgcaaatattgatgttgctacaattaattttatgtgtcactgggaacgccctggt 

113 PQTLSAFKQSANI DVATINFMCHWE RPG 

13071 aaacttcatatcgaagaaagacttgatcttgcacaagcttatagtaagcatattgacggtagcggtggcggtggcgtaaaacgt 

141 KLHIEERLDLAQAYSKHIDGSGGGGVKR

13155 tgctatggaaccccaatcaagaatacaaatcttgatcctaaaagtttcatgagtggacaactttttggcacgcatgcaggaaac 

169 CYGTPIKNTNLDPKSFMSGQLFGTHAGN

13239 ggcagaccaaataatttccatgatggtttggactttggttcaattgatcaccctggcaatgaaatgattgcatgttgcgatgga 

197 GRPNNFHDGLDFGSIDHPGNEM IACCDG 

13323 acagtaacacatgttggaacaatgggagcattaagagcgtattttgtgataaatgatggtacttacaatatcgtttatcaagaa 

225 TVTHVGTMGALRAYFVINDGTYNIVYQE 

13407 tttagttataaccagtcaaatataaaggtaaaagttggcgacaaagttaagaacggacaagtttgcgcaatacgtgacgcggat 

253 FSYNQSNI KVKVGDKVKNGQVCAIRDAD 

13491 catttacatttaggttttactaaaaaagattttatgactgcgttaggatcttctttcatagatgatggaacatgggaagaccct 

281 HLHLGFTKKDFMTALGSSFIDDGTWEDP 

13575 ttgaagtttttagggcaatgttttggagatggagatactggcggagataatgacgataacaataaggataaaaatgatcttatt 

309 LKFLGQCFGDGDTGGDNDDNN KDKNDLI 

13659 tatctattgctatccgatgccttgaatggttggaaattttaa 13700 

337 YLLLSDALNGWKF* 
182ORF006 

14995 atgacaaatagcttaggcgttaaacttgaagagaaaaacttatactataaccctaacaatgctttaggttttaattgcctaatg 

1 MTNSLGVKLEEKNLYYNPNNALGFNCLM 

15079 ttgtttgtaataggcgcacgtggtataggtaaaacttatggttataaaaaatttgttgttaatcgctttattaaacacggcgaa 

29 LFVIGARGIGKTYGYKKFVVNRFIKHGE

15163 caatttatttatttaagaagattcaaaacagaacttaaaaagattcctcaatttttcaaaacaatggcgaaagaatttcctgat 

57 QFIYLRRFKTELKKI PQFFKTMAKE FPD 

15247 cataaacttgaagtaaaaggaaaagaattctattgtgatgataaattaatgggttgggctgttccacttagtacgtggggaatt 

85 HKLEVKGKEFYCDDKLMGWAVPLSTWGI 

15331 gaaaaatctaatgaatatcccgaagttcgtacaattttgtttgatgagtttttaattgagaaatcaaaaatcacttatttacca

113 EKSNEYPEVRTILFDEFLIEKSKITYLP 

15415 aacgaagctgaagccttattgaacatgatggaaacggttttccgaagacgtacaaatacaagatgtgttatgttgagtaatgca 

141 NEAEALLNMMETVFRRRTNTRCVMLSNA 

15499 actagtgtagtgaacccttatttcttgtatttcaatctgcagccagatttgaataagcgttttaatctatatcaagatcgaggt 

169 TSVVNPYFLYFNLQPDLNKRFNLYQDRG

15583 atattgattgaattgtgtgattcaaaagactttgcagaagtgaagagagaaacaccttttggtagattgattcgtggaacagaa 

197 ILIELCDSKDFAEVKRETPFGRLIRGTE

15667 tacgaagattttagtatcaacaatgagtttgtcaatgatagtgatacgtttattgaaaagagaagcaaaaatagtagtttctta 

225 YEDFSINNEFVNDSDTFIEKRSKNSSFL

15751 tgcgccattgcttttgaagggaaaatctttgggtattggatagacgctgaaacaggttgtgtctatgtgagttatgattatcaa 

253 CAIAFEGKIFGYWIDAETGCVYVSYDYQ 

15835 ccaaatacaaatcatttttatgcaatgactacgaaagaccatgaagaaaatagattgctgatgaaaaattggcgaaacaattat 

281 PNTNH FYAMTTKDHEENRLLM KNWRNNY 

15919 tatctttcaacagtggcgaaagcattcaagaatagttatctgcggtttgataacattgttattaagaatttacattatgatttg 

309 YLSTVAKAFKNSYLRFDNIVI KNLHYDL 

16003 tttaataagatgaaaatctggtaa 16026 

337 FNKMKIW * 



182ORF007 

7795 atgagtagacgaaaaggtgcaggacttgctagaaataaccgttatacagcaaaaagcagaccttatccaaatgaaocc^attck 

1 MSRRKGAGLARNNRYTAKSRPYPNETYS

7879 agtgatgtagaagaaatcagctactatgaacattatcgtagacaactcacgctccttacgttccagttgtttgaatgggaaaat 

29 SDVEEISYYEHYRRQLTLLTFQLFEWEN 

7963 ttgccaaaatcaattgaccctcgttatttagaaattgctttacacactaatggttatcttggtttctttaaagaccctacactt 

57 LPKSIDPRYLEIALHTNGYLG FFKDPTL 

8047 gggttcatggtttgcgcaggggcagaagacggtcaaatcgatcattatcacaaccctattttctttacagcaaacgaagcaatg 
85 GFMVCAGAEDGQIDHYHNPIFFTANEAM

8131 tatcacaagagatatcctgttttaagatatgatgatgatgatgataaatcaaaatgtatcatgttgtataataatgacttgaaa 

113 YHKRYPVLRYDDDDDKS KCIMLYNN DLK 

8215 gttcctacgttaccaagtttacatcgttttgctttagatatggcggacataaaccagatatcacgagtgaatcgaagagcgcaa 

141 VPTLPSLHRFALDMADINQISRVNRRAQ 

8299 aaaacacctgtaattattcaaactgatgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcag 

169 KTPVIIQTDEKKYFSLLQAYNQIDENNQ 

8383 gctgttttcgtggataaagatatggagtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtagtagataaacta 

197 AVFVDKDMEFDESFNVWQTNAPYVVDKL 

8467 cgatcagaattgaacgaagtatggaatgaagtgttaacttttctaggtatcaacaatgctaacgtagataagactgcacgtgta 

225 RSELNEVWNEVLTFLGINNANVDKTARV 

8551 caaacatcagaagtcttatctaacaatgaacagattgaaagttcaggtaacatcttgttaaaatcaagaaaagagttttgcgat 

253 QTSEVLSNNEQIESSGN ILLKSRKE FCD 

8635 cgtgtaaatcgtgtctttggcgatgaacttgacggaaagattgacgtgaagtttagaacagacgccgttcgacaattacaactg 

281 RVNRVFGDELDGKIDVKFRTDAVRQLQL 

8719 gcggcaggtcaatcaaaaaaagaccagatgagtggagggttgccaagtgctacttaa 8775 

309 AAGQSKKDQMSGGLPSAT* 
182ORF008 

14105 atgatgaatggtattgatatctctagttatcaaacaggaattgatctttcaaaagttccatgcgattttgtaaatattaaagca 

1 MMNGIDISSYQTGIDLSKVPCDFVNIKA 

14189 acaggcggaacaggttatgtaaaccctgattgtgaccgagcatttcaacaagctttgtctttaggtaaaaagattggtgtgtat 

29 TGGTGYVNPDCDRAFQQALSLGKKIGVY 

14273 cattttgcgcatgagaggggtttagaaggtacacctcaacaagaagcgcaattctttttagataatattaagggttacattggt 

57 HFAHERGLEGTPQQEAQFFLDNIKGYIG 

14357 aaagctgttcttattcttgactttgaagggtcaaatcagaaagatgtaaattgggcgaaagcatttcttgattatgtttataat 

85 KAVLILDFEGSNQKDVNWAKAFLDYVYN

14441 aaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaaaaggcgattat 

113 KTGVKAWFYTYTANLNTTDFS S IAKGDY 

14525 ggtttatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaataattttccaatt 

141 GLWVAEYGSNQPQGYSQ PAPPKTNNF P I 

14609 gttgcctgttttcagttcacaagtaaaggacgtttaccaggatacaacggcaaccttgatttgaatgttttctatggcgatggt

169 VACFQFTS KGRLPGYNGNLDLNVFYGDG 

14693 aatacatgggatctgtatgtaggtaaaaaacaggatcaaattgttcctcctgaaaataaaatatttgacgccacaagtgatgag 

197 NTWDLYVGKKQDQIVPPENKI FDATSDE 

14777 tttattttcactcttacaacaggtagcacaagcgtgttttattttgacggagaaacgatctttgaattgtctgatccaacacaa 

225 FIFTLTTGSTSVFYFDGETIFELSDPTQ 

14861 ctcgatcatattagaggaacatacaatcatgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaatttgatatt 

253 LDH IRGTYNHVHGKE I PSMVWTPEQFD I 

14945 tacttaaaaatgtatgaaaagaaaccagtatataaatag 14983

281 YLKMYEKKPVYK* 
182ORF009 

8765 gtgctacttaaacgttatattgaaagtttcacttattaccaacctgaattatctcgaaaagaacgtattgaagttggccgaaaa 

1 VLLKRYIESFTYYQPELSRKERIEVGRK 

8849 caattgtttgattttgattatccgttttatgacgaaacaaaacgagcagaatttgaaacaaaatttatcaatcacttttacttg 

29 QLFDFDYPFYDETKRAEFETKFINHFYL 

8933 agagagataggctcagaaacgatgggatcatttaagtttaatcttgacgaatatttaaatctaaacatgccctattggaataaa 

57 REIGSETMGSFKFNLDEYLNLNMPYWNK 

9017 atgttcctatcaaatcttgaagagtttccgatttttgatgacatggactacaccattgatgagaaacagaaattgttaaatgag 

85 MFLSNLEEFPIFDDMDYTIDEKQKLLNE 

9101 attgatacaaacatcaaagcgaatcgtgatgaatcgaagaaccaaacgaagcaagtagatcaaacagacaacagaaacaaaaat 

113 IDTN I KANRDESKNQTKQVDQTDNRNKN 

9185 acacgtgacacaggaacaaccgattctttctcaaggaacacttatacagacacccctcaaaaagatttgagaattgccagcaat 

141 TRDTGTTDSFSRNTYTDTPQKDLRIASN 

9269 ggagatggaacaggtgtaatcaattatgcaacaaatatcacagaagatttgagtaaagaaacaacaagctccacaggcgttgaa 

169 GDGTGVINYATNITEDLSKETTSSTGVE 

9353 acaaacaacgacaaaacaaatcaaaatacacgaagcaatgcttctgaaaaagaaacaaagaacacagacattaataaagatcaa 

197 TNNDKTNQNTRSNASEKETKNTDINKDQ 

9437 aatcaaaccaaagatacgattacacgatataaaggtaaaaagggaaacactgattatgctgacttactcgaaaaatatcgtaga 

225 NQTKDTITRYKGKKGNTDYADLLEKYRR 

9521 agtgttttgagaattgagaaaatgatctttagagaaatgaacaaggaaggcttatttctccttgtttatggagggaggtag 9601

253 SVLRI EKMIFREMNKEGLFLLVYGGR* 



182ORF010 

1310 ttgaccgtaagaatatcaaagaatgatagagccaagttagagaaaatctacggtaaatctaacaaagctcgtaaaaaatacaat 

1 LTVRISKNDRAKLEKIYGKSNKARKKYN 

1394 cgtttaagacaaaaaggagttgaggaaaggcaacttccaactgttccaacatcaaagaaaagacttattgactacgtaaaatca 

29 RLRQKGVEERQLPTVPTSKKRLIDYVKS

1478 acaaatatgagtcgtagtgattttaacaagatgttagacgagttggtagattttgcacaaccttacaacgagaattaoattttt 

57 TNMSRSDFNKMLDELVDFAQPYNENYIF

1562 gagatcaacaagcgaaatgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaa 

85 EINKRNVAISRAQIKEAQIKTEQAQKAK

1646 gaagaacactacaaagagcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaacagag 

113 EEHYKELNKVEVKKPTENTIVTPTILTE 

1730 ttaggtgctgacttaccttttcaagcaataccagattttaatattgacgctttcacttctccagaaggagttcagtcttattta 






141 LGADLPFQAI PDFNIDAFTS PEGVQSYL 

1814 gaaaatataggaaaacaagacgaacaatattttgacgaaagagaccaactttattacgacaatctcagacaagcgatgtttact

169 ENIGKQDEQYFDERDQLYYDNFRQAMFT 

1898 attttcaattcagacgctgacgatattgttcgtttacttgactcaatggggcttgatctatttatgaaaacatatgttagtaac 

197 IFNSDADDIVRLLDSMGLDLFMKTYVSN 

1982 ttcttagacatgaaccttgactacatttatgacgaagcagaagtacaacagaaaaaagaacaagtttacagtaagattgcaaaa 

225 FLDMNLDYIYDEAEVQQKKEQVYSKIAK 

2066 gtgatcgagtctgaaacaggtggagaagtcccctcatataaccccacgaagaacatcacaattaattcagaaacaggagaagaa 

253 VIESETGGEVPSYNPTKNITINSETGEE 

2150 ttatga 2155 

281 L * 
182ORF011 

9607 atggtagattttaaccccgacaagcggtttgacggtttacccgctgtattcaaagaacgctttagcaaatatcctcatactgaa 

1 MVDFNPDKRFDGLPAVFKERFSKYPHTE 

9691 tacagatatgaattactattagatgaagaagtatcggctttaattgcctatctgaatgaagttggtgctttagttaatgatatg 

29 YRYELLLDEEVSALIAYLNEVGALVNDM

9775 agtggttatttaaattactttatcgaacattttgttgagaagttagaagagatcacaaatgacacactcaaaaaatggttgtct 

57 SGYLNYFIEHFVEKLEEITNDTLKKWLS 

9859 gatggtacgttagaaaatttaatcaatgatactgtttttgcaaattatatcaaagaaatcaaaagattacaaatcttggttgct 

85 DGTLENLINDTVFANYIKEIKRLQILVA

9943 gaaacacgtgctaacagtgtgaatattcttttgacaaaaaataaaccggatgttgctgatgatcgaacattttggtataagatt 

113 ETRANSVNILLTKNKPDVADDRTFWYKI 

10027 caacgcgacaatactgattatggagccgatcctattgacacgctacgtattgttgcaatcaataaagttagtggccggaatacc 

141 QRDNTDYGADPIDTLRIVAINKVSGWNT 

10111 gctacaggagatatttatcttaacattaaaggaacggagggtgtataa 10158 

169 ATGDIYLNIKGTEGV* 
182ORF012 

10872 atggcaaataaaaatattcaaatgaaggatagcaatgacaataatttatatccaagtgttcgagcagaaaacttgttagatttg 

1 MANKNIQMKDSNDNNLYPSVRAENLLDL 

10956 accagtcgtgctgaattaacaatgacaaattgtcaattatatgcagctggtgataaaacaaatgcaatctcttatctcggtgca 

29 TSRAELTMTNCQLYAAGDKTNAISYLGA

11040 gtaggtatgctcgaaggtatgataaagtttactgaaagtttgacaaaccctgtgatcacaacgctaccagaaggttttagacca 

57 VGMLEGMIKFTESLTNPVITTLPEGFRP

11124 ataagaacaaaacgtattggttgtttcgcaaaatattacacaccaaatccaacagatacaaaagaaatggtttatgtatcaatc 

85 IRTKRIGCFAKYYTPN PTDTKEMVYVSI 

11208 acacctgatggcaaagtaactgtaaatgacaatgtaggtaaaatcgaatatctatccctagataattgcgttttccctctaaaa 

113 TPDGKVTVNDNVGKI EYLSLDNCVFPLK 

11292 taa 11294 

141 * 
182ORF013 

10456 atggcagataaaaatattcaaatgcaggataaagatcataatcgtttaatgcctgttacaattgctaaaaatgttctaacaggc 

1 MADKNIQMQDKDHNRLMPVT IAKNVLTG 

10540 gactctaatcttgaattagttaatgctgaaataagaggtaacgctagtgaagctaaaacacttgcacaacaagctaaagaaact

29 DSNLELVNAEIRGNASEAKTLAQQAKET 

10624 gctgctggtctgtcaacagaaattgacacagtaacatcaaccgcaaatcaagcgttgacgaaggctggtacagcacaacaaacc 

57 AAGLSTE I DTVTSTANQALTKAGTAQ QT 

10708 gcagaacaagcgaaaacaacagcaaacagtatcagcgcagttgcaacggcagctaaaaacacagctgattcagcacaaaaaagt 

85 AEQAKTTAKS ISAVATAAKNTADSAQKS 

10792 gcaactgatctagctgttcgagtaagcagtttagaggacacagcaatacaatatactgtattaccatag 10860 

113 ATDLAVRVSSLE DTAIQYTVLP * 
182ORF014 

13716 atgatagaatatatcacacaatggttggcagatgataatcatcttgtttatggtttgattatatggttaatggttgcaatgatt 

1 MIEYITQWLADDNHLVYGLI IWLMVAMI 

13800 atcgattttgtgttaggttttacaattgccaaatttaacaaggaaatcgactttagtagttttaaagctaaagcaggtatcatt 

29 IDFVLGFTIAKFNKEIDFSS F KAKAGI I 

13884 gttaaggtggcagaaatggttttagtggtttactttattcctgtagcagtaaaattcggtgcagtaggtattacaatgtatata

57 VKVAEMVLVVYFI PVAVKFGAVGITMYI 

13968 acaatgttggttggtttgattttatcagaaatttatagtatactaggacatatttcagatatcgatgatgataataattggact 

85 TMLVGLILSEIYS ILGHI SDIDDDNNWT 

14052 gattatgttaagaagtttttagacggaacactcaacagaaaggacgatattaaatga 14108 

113 DYVKKFLDGTLNRKDD I K * 
182ORF015

854 atggaaatcgtaaaaagcacatttgacacacaaacaccagaaggaatgttacaagtattcaatgccacaaacggggcttcaatt 

1 MEIVKSTFDTQTPEGMLQVFNATNGAS I 

938 ccgttacgtaacgcaattggcgaagtactagaattgaaagatattctagtttactcagacgaagtttctggttttggtggagcc 

29 PLRNAIGEVLELKDILVYSDEVSGFGGA 

1022 gaaccatcacaagcagaactagtcgctttcttcacagaagatggtaaaacttatgcgggtgtatcagcagtagcaacaaaatca 

57 EPSQAELVAFFTEDGKTYAGVSAVATKS 

1106 gctaaaaacctaattgatatgatgactgctaaccctgacatcaaaccaaaaatttcttttgtcgaaggaaaatcaaacggtgga 

85 AKNLIDMMTANPDIKPKISFVEGKSNGG

1190 caaaaatttgtaaatctacaagtggtttcactgtag 1225

113 QKFVNLQVVSL* 
182ORF016 

17033 atgattaacaatttatcattaattttagagggtttaaatcaactaaccaaagatgacaacgatagtctagcgtctatcaagtca 

1 MINNLSLILEGLNQLTKDDNDSLASIKS 

16949 gaaataacacaaggaggaaaacaattaattttatacattgattacgttacaaaagagttcgtgttaacacatgataaatataac 
29 EITQGGKQLILYIDYVTKEFVLTHDKYN

16865 tatgtttatcttgatagccattgcattaatatcgcaacaacgaaatcaatgaaaagcgttgaacactatgcggaacaattgaaa 

57 YVYLDSHCINIAITKSMKSVEHYAEQLK 

16781 catgacggatataaacaaattacggacaaatag 16749

85 HDGYKQITDK* 
182ORF017

154 atgaaatattcactacaacaaatagatgaaattaaatcaacaattttcagaattagattaaaaaggcatgaactagaggaattg 

1 MKYSLQQIDEIKSTIFRIRLKRHELEEL

238 gtggacgaagtaaacgatattgctaaagatccggaggaaagatatcttttatcgtcttattacacagaagaagaacgtttgttt 

29 VDEVNDIAKDPEERYLLSFYYTEEERLF 

322 gaaattccctctgcaagattaatagattattacaacgaaaagatcacaaatctgaaatcggaaatcatatcactcgaaaaaaga 

57 EIPSARLIDYYNEKITNLKSEIISLEKR

406 ttacaaaaactagtaaaataa 426 

85 LQKLVK*
182ORF018 

16737 atgattgcacgaacattcaaagaacaccgcgaactaattgaatggttacgtttctactgtaaacgtaacctttcagacaatgaa 

1 MIARTFKEHRELIEWLRFYCKRNLSDNE

16653 aaaatagagatcatagaggggactttacaagatttcgacgttccggaaataaatatcaccgaacttttgttaactcattcaacg 

29 KIEIIEGTLQDFDVPEINITELLLTHST

16569 ctattacccgaatcgagtcaacttaacattcttgaaaagtattgtcaggcaatgaaattagtaacttcatacgtaaaagttggt 

57 LLPESSQFNILEKYCQAMKLVTSYVKVG

16485 tctcgctatcagttagcgttacaaataccaaaaggctatttaaaggaggtggaataa 16429 

85 SRYQLALQIPKGYLKEVE*
182ORF019

4323 atggaaattaaagaacatgaatcaattttaaatggtattcttgaaagtgtcacagacggtgaagcaagatcaaagattgtagaa

1 MEIKEHESILNGILESVTDGEARSKIVE

4407 catcttgaagcattgcgagaagactacggagcaacaactgaagctttgacatcagcaaatagcacacttgaaaagttaaagaaa 

29 HLEALREDYGATTEALTSANSTLEKLKK

4491 gataacgaagcgttggttatttcaaactcaaaattgttccgagaacgagcgatcgtagaaccagcagaaaataacgaaccagaa 

57 DNEALVISNSKLFRERAIVEPAENNEPE

4575 acagaccagaatattacactagacgatttaggaatttaa 4613 

85 TDQNITLDDLGI* 
182ORF020

10158 atggcagacattagaacacaactaacaagtgaagatggatcagacaatttatttccaatttcaaaagccgttaatattatgact 

1 MADIRTQLTSEDGSDNLFPISKAVNIMT

10242 aatagcggtacgaatgtagaaggagaattgggtacactcaaacaaaatgacgaaacaatgaatacctcagttcaaaatgctgta 

29 NSGTNVEGELGTLKQNDETMNTSVQNAV

10326 gttactgccaatcaagcaaaagattctgtagctgaattaaatgtaaatgttggtaaactaaccaatcgaataacaacattagag 

57 VTANQAKDSVAELNVNVGKLTNRITTLE

10410 agtacagtggctaatcttgatggtattcgttatgtagaggtgtaa 10454 

85 STVANLDGIRYVEV*
182ORF021 

17339 atgaacaataaatcattaatagctgaaaaaggagaggtatctctacttcacccctttaatgagtgggatatgaattatcatatc

1 MNNKSLIAEKGEVSLLHPFNEWDMNYHI

17255 atagataccgaaaacaataaacattatcttattgatattgatgaggtaggcgatgaggaatattgtttgttatcttttgaagaa 

29 IDTENNKHYLIDIDEVGDEEYCLLSFEE 

17171 ctaaaggaattagatatggatcttatttccgagtattcatggaaaactacagaaataacatattaa 17106 

57 LKELDMDLISEYSWKTTEITY*
182ORF022

12868 gtgggttgtctaatgctaaagctgaaacgttggaaggtcaagcagagatcatcgctcaaggggataaaacaggtcaatggatgg 

1 VGCLMLKLKRWKVKQRSSLKGIKQVNGW

12952 ataatacacctgtttcttctgcaggttatactaaccctcagaccctttcagcatttaaacaatctgcaaatattgatgttgcta 

29 IIHLFLLQVILTLRPFQHLNNLQILMLL 

13036 caattaattttatgtgtcactgggaacgccctggtaaacttcatatcgaagaaagacttgatcttgcacaagcttatagtaagc 

57 QLILCVTGNALVNFISKKDLILHKLIVS

13120 atattgacggtagcggtggcggtggcgtaa 13149 

85 ILTVAVAVA* 
182ORF023 

12189 atggttgttgttttggacatgcaagttatgcgcatatcctcgataataacttacgccaccttcgattgtgttaccagaaatttc 

1 MVVVLDMQVMRISSIITYATFDCVTRNF

12105 acagaaattaattacattctgataatcatcgtcactgtcgataatgatcgctgtacaaaaatgaatacggttgtttttcacaaa 

29 TEINYILIIIVIVDNDRCTKMNTVVFHK

12021 gaaacctctaaaacctgtacccccagtattgatatcgttcccttgccacataccatttacatcgggaaaagctgttttgataat 

57 ETSKTCTPSIDIVPLPHTIYIGKSCFDN

11937 tgcttgagagatattagagaatag 11914 

85 CLRDIRE* 
182ORF024 

6174 atgcttgtaactatctcatctttaaaaacgaagaaacctatcctagtaaacggcagtatgcctttgttactgatattgaatata 

1 MLVTISSLKTKKLILVNGSMPLLLILNI

6258 agaatgacaacacaagtttcgttacctttgaaattgatgttttacaaacttatcgtttcgatattggtatacgagaaagtttca 

29 RMTTQVSLPLKLMFYKLIVSILVYEKVS

6342 ttgcaaaagaacaccctcaactttattattcgaatggaatacctttcattaatacaattgaagagtcgcttgattacggtagag 

57 LQKNTLNFIIRMEYLSLIQLKSRLITVE

6426 aatacacaacaacaaatgtaa 6446 

85 NTQQQM*
182ORF025

548 atgggtcgaaaactaatgcaacgaaacgtaacatcaactaaagtagaattctcagaagttatcgtacaagatggagcgccaaca 

1 MGRKLMQRNVTSTKVEFSEVIVQDGAPT 

632 attgtaccatgcgaaccagttgtcttaacaggaaaactttcagaagaaaaagctttatcagcgatcaaacgtaaaaaccctgat 

29 IVPCEPVVLTGKLSEEKALSAIKRKNPD 

716 aaaaacgtagttgtaacaaatgtttcacatgaaacagcgctttacacaatgccagtcgataaatttatcgagttagcagacaaa 

57 KNVVVTNVSHETALYTMPVDKFIELADK 

800 tcaacacaagcctaa 814 

85 STQA*
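
As a consistency check on a listing of this kind, each nucleotide row can be translated with the standard genetic code and compared with the amino acid row printed beneath it. The following is a minimal sketch in Python and is not part of the original disclosure; the names `CODON_TABLE` and `translate` are illustrative, and applying the standard code codon by codon (so that a `gtg` start reads as V, as the listing prints it) is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a nucleotide row codon by codon; '*' marks a stop codon.

    Whitespace is ignored, and any codon containing a character other
    than a, c, g, t is reported as 'X' rather than guessed.
    """
    dna = "".join(dna.lower().split())
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna), 3))


if __name__ == "__main__":
    # Final row of 182ORF025 (positions 800-814 in the listing).
    print(translate("tcaacacaagcctaa"))
```

Applied to the final row of 182ORF025 (`tcaacacaagcctaa`), the sketch yields `STQA*`, which agrees with the printed translation.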
182ORF026

13259 atggaaattatttggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcatgaaacttttaggatcaagatttgtatt 

1 MEIIWSAVSCMRAKKLSTHETFRIKICI

13175 cttgattggggttccatagcaacgttttacgccaccgccaccgctaccgtcaatatgcttactataagcttgtgcaagatcaag 

29 LDWGSIATFYATATATVNMLTISLCKIK 

13091 tctttcttcgatatgaagtttaccagggcgttcccagtgacacataaaattaattgtagcaacatcaatatttgcagattgttt 

57 SFFDMKFTRAFPVTHKINCSNINICRLF 

13007 aaatgctga 12999 

85 KC*
182ORF027

14896 atgaacatgattgtatgttcctctaatatgatcgagttgtgttggatcagacaattcaaagatcgtttctccgtcaaaataaaa 

1 MNMIVCSSNMIELCWIRQFKDRFSVKIK 

14812 cacgcttgtgctacctgttgtaagagtgaaaataaactcatcacttgtggcgtcaaatattttattttcaggaggaacaatttg 

29 HACATCCKSENKLITCGVKYFIFRRNNL 

14728 atcctgttttttacctacatacagatcccatgtattaccaccgccatagaaaacattcaaatcaagattgccgttgtatcctgg 

57 ILFFTYIQIPCITIAIENIQIKIAVVSW 

14644 taa 14642 

85 * 
182ORF028 

14430 atgtttataataaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaa 

1 MFIIKQALKHGFIRIQQTSIQLIFLVLQ 

14514 aaggcgattatggtttatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaata 

29 KAIMVYGLLNMDQINHKATLNQRHLKQI

14598 attttccaattgttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttga 14672 

57 IFQLLPVFSLQVKDVYQDTTAILI*
182ORF029 

17606 atgaatgaaccgatcgtatacacagaaatttattcaaataacgtggtatgtatgaaaatttttagagatgaggataaacttagt 

1 MNEPIVYTEIYSNNVVCMKIFRDEDKLS

17522 aaattcctctatttagaatttgaggtggatgaggctaaaaagttacttgaaaataaaacaatttcatttgatgataactggact

29 KFLYLEFEVDEAKKLLENKTISFDDNWT 

17438 ttctcaataaattatccagaatattaa 17412 

57 FSINYPEY*
182ORF030 

16429 atggctacattctacaaggaaccaatatacgatatcacagtattttatatagatggttgggaggttttgatacacaaaaccgaa 

1 MATFYKEPIYDITVFYIDGWEVLIHKTE 

16345 cctctcaccttaacaaaagcattaaaatatagccgtatatacctagaaatggatatagtgaattgcgttagaatagaaagaaat 

29 PLTLTKALKYSRIYLEMDIVNCVRIERN

16261 ggacgtcctatagctacattttacagggaattattaaaactgtataaggagaaagaactatga 16199 

57 GRPIATFYRELLKLYKEKEL* 



182ORF031 

8603 atgttacctgaactttcaatctgttcattgttagataagacttctgatgtttgtacacgtgcagtcttatctacgttagcattg 

1 MLPELSICSLLDKTSDVCTRAVLSTLAL 

8519 ttgatacctagaaaagttaacacttcattccatacttcgttcaattctgatcgtagtttatctactacatatggagcatttgtt 

29 LIPRKVMTSFHTSFNSDRSLSTTYGAFV 

8435 tgccatacattaaaagattcgtcaaactccatatctttatccacaaaaacagcctga 8379 

57 CHTLKDSSNSISLSTKTA* 
182ORF032 

11413 atgtttcatcaaaaacaacttgtttcgggttcgtttcagggtgcaataggtaattcttttgattttcttctttcatcttgttca 

1 MFHQKQLVSGSFQGAIGNSFDFLLSSCS 

11329 tttgaatatcaattcgttcttccatatgaacctccttattttagagggaaaacgcaattatctagggatagatattcgatttta 

29 FEYQFVLPYEPPYFRGKTQLSRDRYSIL 

11245 cctacattgtcatttacagttactttgccatcaggtgtgattgatacataa 11195 

57 PTLSFTVTLPSGVIDT* 
182ORF033

4942 atgtcaacaaaaatttcttcaatcgttcgacctaaaggcatgtttccttttttaaacattttcaaagggttacgccaagatttg

1 MSTKISSIVRPKGMFPFLNIFKGLRQDL 

4858 tatcggataactactttaccaatacggtcaactaaagttgaaataaattcgttttttactacgtctaaacgtgtgatccctgca 

29 YRITTLPIRSTKVEINSFFTTSKRVIPA

4774 ccaaccgcttcgatgttatctgcatttggcataggtacgttcgcctga 4727

57 PTASMLSAFGIGTFA*
182ORF034 

6160 gtgtttatctactctaaaaactcccccgagttgtgtatccctttgataagaacaatctctattctcgttaagaacaggaaacga 

1 VFIYSKNSPELCIPLIRTISILVKNRKR

6076 attaaagtacgattcctgttcctgttgagttttaaaccatcttgtgtgtgtataggtgttatcaaaaggcacgttagccaacaa 

29 IKVRFLFLLSFKPSCVCIGVIKRHVSQQ 

5992 ttttacatttgtataccttcttgccataattgtcctccttag 5951
57 FYICIPSCHNCPP*
182ORF035 

15758 atggcgcataagaaactactatttttacttctcttttcaataaacgtatcactatcactgacaaactcatcgctgacaccaaaa

1 MAHKKLLFLLLFSINVSLSLTNSLLILK

15674 tcttcgtattctgttccacgaatcaatctaccaaaaggtgtttctctcttcacttctgcaaagtcttttgaatcacacaattca 

29 SSYSVPRINLPKGVSLFTSAKSFESHNS 

15590 atcaatatacctcgatcctga 15570 

57 INIPRS* 
182ORF036

2315 atgtctgtgctgccttgcattttacaccactcaaaaaaagaatcgatttctaaaccgaacgtcatattgtcaacgttgtctata 

1 MSVLPCILHHSKKESISKPNVILSTLSI 

2231 tcgcatacgccccacgaccatacacgacaatcgttgagatcagttgttgtttcaaagtcgccagtatatttcttaatcataatt 

29 SHTPHDHTRQSLRSVVVSKSPVYFLIII 

2147 cttctcctgtttctgaattaa 2127 

57 LLLFLN* 
182ORF037 

12280 gtgagttacgacaataaacatctacatcaatataagcttgatccacatcttgaaactcaaacaaagcgtttctatttccgtatg 

1 VSYDNKHLHQYKLDPHLETQTKRFYFRM 

12196 ctagaaaatggttgttgttttggacatgcaagttatgcgcatatcctcgataataacttacgccaccttcgattgtgttaccag 

29 LENGCCFGHASYAHILDNNLRHLRLCYQ

12112 aaatttcacagaaattaa 12095 

57 KFHRN*
182ORF038 

14769 gtgatgagtttattttcactcttacaacaggtagcacaagcgtgttttattttgacggagaaacgatctttgaattgtctgatc

1 VMSLFSLLQQVAQACFILTEKRSLNCLI 

14853 caacacaactcgatcatattagaggaacatacaaccacgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaat 

29 QHNSIILEEHTIMFMEKKSHQWCGHLNN 

14937 ttgatatttacttaa 14951 

57 LIFT* 
182ORF039 

9992 atgttgctgatgatcgaacattttggtataagattcaacgcgacaatactgattatggagccgatcctattgacacgttacgta 

1 MLLMIEHFGIRFNATILIMEPILLTRYV

10076 ttgttgcaatcaataaagttagtggctggaataccgctacaggagatatttatcttaacactaaaggaacggagggtgtataat 

29 LLQSIKLVAGIPLQEIFILTLKERRVYN 

10160 ggcagacattag 10171 

57 GRH*
182ORF040 

16202 atgagaaaagatttcgtctacattaacacacccgatccaaaagcaaacaaaaaggcgttagcaaaaatcactaacgccaaagaa 

1 MRKDFVYINTPDPKANKKALAKITNAKE 

16118 ccaaaacaaaactatcgcagactacaattactatgttatctactat teat cattgtaatagaactaatcgtggtagctct acta 

29 PKQNYRRLQLLCYLLFIIVIELIVVALL

16034 aaatag 16029 

57 K*

182ORF041

3886 atggaactatataaagcaatgtttatcgtacgtgatgaaggtactattgacggttacgatactgaacactatgtagatacttct 

1 MELYKAMFIVRDEGTIDGYDTEHYVDIS 

3970 ttacatgactttgaagaaatatatggaaaagaaacacgtgaaattgaagcagtaacattagtaaaaacaggaaatttaaaaaaa

29 LHDFEEIYGKETREIEAVTLVKTGNLKK 

4054 taa 4056 

57 * 
182ORF042

10832 gtgtcctctaaactgcttactcgaacagctagatcagttgcactttcttgtgctgaatcagctgtgtttttagctgccgttgca 

1 VSSKLLTRTARSVALFCAESAVFLAAVA 

10748 actgcgctgatactgtttgctgttgttttcgcttgttctgcggtttgttgtgctgtaccagccttcgtcaacgcttga 10671 

29 TALILFAVVFACSAVCCAVPAFVNA*
182ORF043 

10652 gtgtcaatttctgttgacaaaccagcagcagtttctttagcttgttgtgcaagtgttttagcttcactagcgttacctcttatt 

1 VSISVDKPAAVSLACCASVLASLALPLI 

10568 tcagcattaactaattcaagattagagtcgcctgttagaacatttttagcaattgtaacaggcattaaacgattatga 10491 

29 SALTNSRLESPVRTFLAIVTGIKRL*
182ORF044 

6457 atgaaaagttgttacatttgttgttgtgtattctctaccgtaatcaagcgactcttcaattgtattaatgaaaggtactccatt

1 MKSCYICCCVFSTVIKRLFNCINERYSI 

6373 cgaataataaagttgagggtgttcttttgcaatgaaactttctcgtataccaatatcgaaacgataagtttgtaa 6299 

29 RIIKLRVFFCNETFSYTNIETISL* 
182ORF045 

6729 atgaatggtatacctgtatacgacgtcacatacatcccgactatcttatttaaaaaaggttctttcgttgtaagaaacgccatg

1 MNGIPVYDVTYIPTILFKKGSFVVRNAM

6645 tactctccaaaattagcattgcctgccccatttggcttgtatacctccccacttgaattgataggaagtaaataa 6571 

29 YSPKLALPAPFGLYTSPLELIGSK* 
182ORF046 

2372 atggtttcaaatggtgtaaagaagcaaaagaagatcgaacattctccacactcatatcaaatatgggtcaatggtatgctttgg 

1 MVSNGVKKQKKIEHSPHSYQIWVNGMLW 

2456 aaatttgttgggaagtcaattacacaacaacaaaatcaggtaaaacgaaaaaagagaaatctcgaacaacaa 2527 

29 KFVGKLITQQQNQVKRKKRNLEQ*
182ORF047

13353 atgctcccattgttccaacatgtgttaccgtcccatcgcaacacgcaatcatctcattgccagggtgaccaattgaaccaaagc 

1 MLPLFQHVLLFHRNMQSFHCQGDQLNQS 

13269 ccaaaccatcatggaaattatttggtctgccgtttcctgcacgcgtgccaaaaagctgtccactcatga 13201 

29 PNHHGNYLVCRFLHACQKVVHS*
182ORF048 

3395 atgtcagggtttgttccgaactttccatacaagctatttaacatacctttggcgttagcttttctagccccttcggtggtgttc 

1 MSGFVPNFPYKLFNIPLALAFLAPSVVF

3311 tttacttcgatccatttatcgatccagcctttgaacatatcacaagaagctttgaacatatatccgtaa 3243 

29 FTSIHLSIQPLNISQEALNIYP*
182ORF049 

1578 atgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaagaagaacactacaaag 

1 MLQSQERKSKKRKLKQSKLKKRKKNTTK 

1662 agcttaacaaagttgaagctaagaagcccacagaaaacacaaccgtcacaccaactattttaa 1724 

29 SLTKLKLRSPQKTQLSHQLF* 
182ORF050 

8012 atggttatcttggtttctttaaagaccctacacttgggttcatggtttgcgcaggggcagaagatggtcaaatcgatcattatc 

1 MVILVSLKTLHLGSWFAQGQKMVKSIII

8096 acaaccctattttctttacagcaaacgaagcaatgtatcacaagagatatcctgttttaa 8155 

29 TTLFSLQQTKQCITRDILF* 
182ORF051 

9390 atgcttctgaaaaagaaacaaagaacacagacattaataaagatcaaaatcaaaccaaagatacgattacacgatataaaggta 

1 MLLKKKQRTQTLIKIKIKPKIRLHDIKV 

9474 aaaagggaaacactgattatgctgacttactcgaaaaatatcgtagaagtgttttga 9530 

29 KRETLIMLTYSKNIVEVF* 
182ORF052 

4096 gtgatagttgacaagagtcaaatttggcgagattgggcgaatgtacacgtgaaatatcgtgcgctcccgctaagttatggacac 

1 VIVDKSQIWRDWANVHVKYRALPLSYGH 

4180 ataaacgttttgaccgtcaaccaatcgcaaaaaccttttaggagtagcccttaa 4233

29 INVLTVNQSQKPFRSSP* 
182ORF053 

15656 gtggaacagaatacgaagattttagtaccaacaatgagtttgtcaatgatagtgatacgtttattgaaaagagaagtaaaaata 

1 VEQNTKILVSTMSLSMIVIRLLKREVKI

15740 gtagtttcttatgcgccattgcttttgaagggaaaatctttgggtattggatag 15793 

29 VVSYAPLLLKGKSLGIG* 



182ORF054 

8136 gtgatacattgcttcgtttgctgtaaagaaaatagggttgtgataatgatcgatttgaccatcttctgcccctgcgcaaaccat 
1 VIHCFVCCKENRVVIMIDLTIFCPCANH 
8052 gaacccaagtgtagggtctttaaagaaaccaagataaccattagtgtgtaa 8002 

29 EPKCRVFKETKITISV* 
182ORF055 

8324 atgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcaggctgtttttgtggataaagatatgg 
1 MKRNTSHCYKLITKLTKIIRLFLWIKIW

8408 agtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtag 8455 

29 SLTNLLMYGKQMLHM* 
182ORF056 

6549 gtggcccatctcctttttcctactatttacttcctatcaattcaagtggggaggtatacaaaccaaatggggcaggcaatgcta 
1 VAHLLFPIIYFLSIQVGRYTNQMGQAML

6633 attttggagagtacatggcgtttcttacaacgaaagaaccttttttaa 6680 

29 ILESTWRFLQRKNLF* 
182ORF057

8264 atgtccgccatatctaaagcaaaacgatgtaaacttggtaacgtaggaactttcaagtcattattatacaacatgatacatttt 
1 MSAISKAKRCKLGNVGTFKSLLYNMIHF 
8180 gatttatcatcatcatcatcatatcttaaaacaggatatctcttgtga 8133 

29 DLSSSSSYLKTGYLL* 
182ORF058 

5176 gtgtattcaaattcgcttacttcgtcacctgtgtataaagcgttcattacaccagcaacgaaaccattgaaattatcccatgaa 
1 VYSNSLTSSPVYKAFITPATKLLKLSHE 
5092 gtaaatgctttttctaaccatgcttcttggatcgtttgtttgtag 5048 

29 VNAFSNHASWIVCL* 
182ORF059 

15876 atggtcttccgtagtcattgcataaaaatgatttgtatttggttgataatcataactcacatagacacaacctgtttcagcgtc 

1 MVFRSHCIKMICIWLIIITHIDTTCFSV 

15792 tatccaatacccaaagattttcccttcaaaagcaatggcgcataa 15748

29 YPIPKDFPFKSNGA*

182ORF060

15404 gtgatttttgatttctcaattaaaaactcatcaaacaaaattgtacgaacttcgggatattcattagatttttcaattccccac 

1 VIFDFSIKNSSNKIVRTSGYSLDFSIPH

15320 gtactaagtggaacagcccaacccattaatttatcatcacaatag 15276 

29 VLSGTAQPINLSSQ* 

182ORF061 

2102 atgaggggacttctccacctgtttcagactcgatcacttttgcaatcttactgtaaacttgttcttttttctgttgtacttctg
1 MRGLLHLFQTRSLLQSYCKLVLFSVVLL
2018 cttcgtcataaatgtagtcaaggttcatgcctaagaagttactaa 1974

29 LRHKCSQGSCLRSY* 
182ORF062 

1992 atgtctaagaagttactaacatatgttttcataaatagatcaagccccattgagtcaagtaaacgaacaatatcgtcagcgtct 
1 MSKKLLTYVFINRSSPIESSKRTISSAS

1908 gaattgaaaatagtaaacatcgcttgtctgaaattgtcgtaa 1867 

29 ELKIVNIACLKLS* 
182ORF063 

14306 gtgtaccttctaaacccctctcatgcgcaaaatgatacacaccaatctttttacctaaagacaaagcttgttgaaatgctcggt 
1 VYLLNPSHAQNDTHQSFYLKTKLVEMLG

14222 cacaatcagggtttacataacctgttccgcctgttgctttaa 14181 
29 HNQGLHNLFRLLL* 
182ORF064

7356 atgatgttagtcaaaccaacaaaagggttgttacttgctaaggctgaaaagatcgctcctcctgtactcattgcactgtttccc 
1 MMLVKPTKGLLLAKAEKIAPPVLIALFP 
7272 ataccatgtctgaaagtattgcgaatgttttgctctcga 7234 

29 IPCLKVLRMFCS* 
182ORF065 

3582 atgaatgctatctgtatcacaataaataatgcgatcaaaacatttttgagcggttgtaatggtagtatatctaccccaagccgt 
1 MNAICITINNAIKTFLSGCNGSISTPSR 
3498 cacaaaactagcaagcggaacataaacaggatctcttaa 3460 

29 HKTSKRNINRIS* 
182ORF066 

4234 atgtggctactcttttttgtgtttcacagaattatgtttcacgtgaaacagtttttatggtataatagaatcaaaaggaggtgg 
1 MWLLFFVFHRIMFHVKQFLWYNRIKRRW 
4318 agattatggaaattaaagaacatgaatcaattttaa 4353 

29 RLWKLKNMNQF* 
182ORF067

13882 atgatacctgctttagctttaaaactactaaagtcgacttccttgttaaatttggcaattgtaaaacctaacacaaaatcgata
1 MIPALALKLLKSISLLNLAIVKPNTKSI 

13798 atcattgcaaccattaaccatataatcaaaccataa 13763 

29 IIATINHIIKP* 
182ORF068 

7267 atgtctgaaagtattgcgaatgttttgctcttgagcaatcaaggagtttttgtttccttgcatgaatgcagaagcatagtcaga 

1 MSESIANVLLLSNQGVFVSLHECRSIVR 

7183 tttaactcctacatcgttaggatcattatcgattaa 7148 

29 FNSYIVRIIID* 
182ORF069 

5027 gtggaacaatgtttttacatcgggaacttcctgtttaaatacccctgtaacagactcgtcagggttgaacttatgttcctgtgc 

1 VEQCFYIGNFLFKYPCNRLVRVELMFLC

4943 aatgtcaacaaaaatttcttcaatcgttcgacctaa 4908 

29 NVNKNFFMRST* 
182ORF070 

1031 gtgatggttcggctccaccaaaaccagaaacttcgtctgagtaaactagaatatctttcaattctagtacttcgccaattgcgt 

1 VMVRLHQNQKLRLSKLEYLSILVLRQLR 

947 tacgtaacggaattgaagccccgtttgtggcattga 912 

29 YVTELKPRLWH* 
182ORF071 

11741 atggttttgcattatggttgccacaaggcgctcaaagtggtaaaggaattttctttaatgatactcgcaattacaatcgttttg 

1 MVLHYGCHKALKVVKEFSLMILAITIVL 

11825 actttgatttgtttgttcgtaactgtactttaa 11857 

29 TLICLFVTVL* 
182ORF072 

11723 atgtttacattaaatgccgtcattgtttcaaactttaatgtcgtttctcccgatcctaagaaagtaactacaggtacatcacgt 

1 MFTLNAVIVSNFNVVSPDPKKVTTGTSR 

11639 ttcaattcaatggtgttagcaaagcgataa 11610 

29 FNSMVLAKR*
182ORF073 

2876 gtgaagccgcctttgtatgctttacgtaagtctttatcaaaccctaaagacaaaataggaaaccattgtttgaaagttgatttt 

1 VKPPLYALRKSLSNPKDKIGNHCLKVDF 

2792 ccatgtgtagctttcagccaatctttgtaa 2763 

29 PCVAFSQSL* 
182ORF074 

8923 gtgattgataaattttgtttcaaattctgctcgttttgtttcgtcataaaacggataatcaaaatcaaacaattgttttcggcc 

1 VIDKFCFKFCSFCFVIKRIIKIKQLFSA

8839 aacttcaatacgttcttttcgagataa 8813

29 NFNTFFSR*

182ORF075

7463 gtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaaccgttttctttttcagaaacatagt 

1 VLHYLEYFRYLPLYLPRGSNRFLFQKHS 

7379 tgtttacttgttgtcctgctcccatga 7353 

29 CLLVVLLP* 

182ORF076 

2426 atgagtgtggagaatgttcgatcttcttttgcttctttacaccatttgaaaccatttttgaataaccatgaaagcataaactct
1 MSVENVRSSFASLHHLKPFLNNHESINS

2342 ccgtcaaatttttcgttgtggaaataa 2316 

29 PSNFSLWK* 
182ORF077 

11858 atgaaggaacgtatgttgttgttgctagaggtagaggggttacatttgaaaattgtctattctctaatatctctcaagcaatta 

1 MKERMLLLLEVEGLHLKIVYSLISLKQL 

11942 tcaaaacagcttttcccgatgtaa 11965 

29 SKQLFPM* 
182ORF078 

7671 gtgcctacaatatttggttcttttaatttaatgaaattccatgcttttcttgtttgtaagtttggtgtagctactcgattgctc 

1 VPTIFGSFNLMKFHAFLVCKFGVATRLL

7587 tttgtgccatacattgagaagtaa 7564 

29 FVPYIEK* 
182ORF079 

7488 gtgaaagataagtttgatccaagctgtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaa 

1 VKDKFDPSCVTLSGIFSISATLPAKRFK

7404 ccgttttctttttcagaaacatag 7381 

29 PFSFSET*
182ORF080 

4473 gtgtgctatttgctgatgtcaaagctccagttgttgctccgtagtctcctcgcaatgcttcaagatgttctacaatctttgatc 

1 VCYLLMSKLQLLLRSLLAMLQDVLQSLI

4389 ttgcttcaccgtctgtga 4372 

29 LLHRL*
Table 24



Sequence similarities phage 182 and public databases 

Phage: 182 
Database: nr 

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) >...  384  e-105
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) >...  374  e-103
gi|1429238|gnl|PID|e1173412 (X99260) tail protein [Bacteriophag...  346  3e-94
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2...  208  8e-53
gi|1181970|gnl|PID|e221269 (Z47794) tail protein [Bacteriophage...  62  8e-09
gi|1181968|gnl|PID|e221267 (Z47794) tail protein [Bacteriophage...  56  6e-07
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM...  49  8e-05



Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0...  665  0.0
gi|1429230|gnl|PID|e1173404 (X99260) DNA polymerase [Bacterioph...  657  0.0
gi|118849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP...  654  0.0
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP...  654  0.0
gi|15732 (X53371) DNA polymerase (AA 1-575) [Bacteriophage phi-29]  651  0.0
gi|15734 (X53370) DNA polymerase (AA 1-575) [Bacteriophage phi-29]  651  0.0
gi|1572479|gnl|PID|e242301 (X96987) DNA polymerase [Bacteriopha...  565  e-160
gi|1072656|pir||S51275 DNA polymerase - phage CP-1 >gi|836593|g...  301  1e-80
gi|118847|sp|P22374|DPOM_ASCIM PROBABLE DNA POLYMERASE >gi|8385...  71  3e-11
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833...  65  1e-09
gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE >gi|1018...  62  1e-08
gi|1084487|pir||S41618 DNA polymerase - slime mold (Physarum po...  61  3e-08
gi|2435429 (AF012250) unassigned reading frame (possible DNA po...  61  3e-08
gi|578157|gnl|PID|e246743 (X52106) DNA polymerase [Neurospora i...  59  1e-07
gi|2147969|pir||S72369 probable DNA-polymerase - Gelasinospora ...  58  2e-07
gi|2147968|pir||S62752 probable DNA-polymerase - Gelasinospora ...  58  2e-07
gi|3511140 (AF061244) B type DNA polymerase [Agrocybe aegerita]  57  3e-07
gi|118850|sp|P10479|DPOL_BPPRD DNA POLYMERASE (PROTEIN PI) >gi|...  56  6e-07
gi|578144 (X63909) putative DNA-polymerase, B-type [Morchella c...  47  3e-04
gi|232013|sp|P30322|DPOM_AGABT PROBABLE DNA POLYMERASE >gi|3208...  46  6e-04

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ...  309  2e-83
gi|138118|sp|P07531|VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ...  305  3e-82
gi|1429236|gnl|PID|e1173410 (X99260) major head protein [Bacter...  300  1e-80
gi|1181958|gnl|PID|e221257 (Z47794) major head protein [Bacteri...  152  6e-36

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR...  52  8e-06
gi|1429242|gnl|PID|e1173416 (X99260) morphogenesis protein [Bac...  48  7e-05
gi|137933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR...  47  2e-04

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

gi|137944|sp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT...  402  e-111
gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROT...  402  e-111
gi|1429245|gnl|PID|e1173419 (X99260) encapsidation protein [Bac...  381  e-105
gi|1181972|gnl|PID|e221271 (Z47794) encapsidation protein [Bact...  159  2e-38



Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

gi|1429239|gnl|PID|e1173413 (X99260) upper collar protein [Bact...  271  5e-72
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...  256  1e-67
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...  256  2e-67
gi|1181960|gnl|PID|e221259 (Z47794) connector protein [Bacterio...  148  6e-35

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

gi|4210750|gnl|PID|e1374037 (AJ132604) LyaL protein [Lactococcu...  139  2e-32
gi|462559|sp|P34020|LYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC...  75  8e-13
gi|2327014 (U82823) putative lysozyme [Saccharopolyspora erythr...  64  2e-09
gi|126652|sp|P25310|LYCM_STRGL LYSOZYME M1 PRECURSOR (1,4-BETA-...  60  2e-08
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  60  2e-08
gi|67761|pir||MUBPCP N-acetylmuramoyl-L-alanine amidase (EC 3.5...  59  3e-08
gi|4105636 (AF049087) lys [Leuconostoc oenos bacteriophage 10MC]  59  3e-08
gi|623064 (L02496) mur amidase; muramidase [Bacteriophage LL-H]  57  1e-07
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  57  2e-07
gi|126597|sp|P00721|LYCH_CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME...  57  2e-07
gi|127788|sp|P19385|LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  57  2e-07
gi|67762|pir||MUBPC7 N-acetylmuramoyl-L-alanine amidase (EC 3.5...  56  3e-07
gi|3025168|sp|P76421|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN...  53  2e-06
gi|4204413 (AF047001) Lys44 [Oenococcus oeni temperate bacterio...  53  3e-06
gi|2116978|gnl|PID|d1020940 (D88151) cortical fragment-lytic en...  52  5e-06
gi|2392844 (AF011378) lysin [Bacteriophage sk1]  48  8e-05



Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

gi|1429240|gnl|PID|e1173414 (X99260) lower collar protein [Bact...  180  1e-44
gi|137921|sp|P04333|VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE...  171  5e-42
gi|215341 (M12456) p11 lower collar protein [Bacteriophage phi-29]  98  9e-20
gi|224162|prf||1011232B protein p11, lower collar [Bacteriophage...  97  1e-19
gi|535260 (Z30339) STARP antigen [Plasmodium reichenowi]  50  1e-05
gi|4049753 (AF063866) ORF MSV230 hypothetical protein [Melanopl...  49  4e-05
gi|2131557|pir||S70306 hypothetical protein YEL077c - yeast (Sa...  48  5e-05
gi|131782|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD...  48  7e-05
gi|2131309|pir||S70305 hypothetical protein YBL113c - yeast (Sa...  47  2e-04
gi|499325 (Z26314) STARP antigen [Plasmodium falciparum]  46  3e-04
gi|3845171 (AB001391) ribosome releasing factor (OO, TP) [Plasm...  46  3e-04
gi|731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN...  45  5e-04
gi|1632829|gnl|PID|e276379 (Y08924) AARP2 protein [Plasmodium f...  45  5e-04
gi|1176490|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I...  45  5e-04
gi|1077300|pir||S51848 hypothetical protein HRD1054 - yeast (Sa...  45  5e-04
gi|2425143 (AF020407) WimA [Dictyostelium discoideum]  45  6e-04
gi|1181961|gnl|PID|e221260 (Z47794) collar protein [Bacteriopha...  45  6e-04
gi|2132657|pir||S64819 probable membrane protein YLL067c - yeas...  45  8e-04
gi|2133041|pir||S65341 probable membrane protein YPR204W - yeas...  45  8e-04
gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1...  45  8e-04

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN >gi|75815|pi...  69  3e-11
gi|1572478|gnl|PID|e242334 (X96987) terminal protein [Bacteriop...  65  3e-10
gi|1429231|gnl|PID|e1173405 (X99260) terminal protein [Bacterio...  64  1e-09

Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

gi|137928|sp|P07537|VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE...  51  6e-06
gi|1429241|gnl|PID|e1173415 (X99260) pre-neck appendage protein...  51  6e-06
gi|137927|sp|P20345|VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE...  50  1e-05



Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14...  97  6e-20
gi|137938|sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14...  96  8e-20
gi|1429243|gnl|PID|e1173417 (X99260) lysis protein [Bacteriopha...  96  8e-20
gi|215332 (M14782) lysis protein [Bacteriophage phi-29]  94  5e-19

Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

gi|15670 (V01155) reading frame 10 (may be gene 4) [Bacteriopha...  70  5e-12
gi|138072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A >gi|75836|pir...  69  7e-12

Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

gi|1429235|gnl|PID|e1173409 (X99260) head morphogenesis protein...  61  2e-09
gi|138111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ...  57  3e-08
gi|138112|sp|P07533|VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ...  54  1e-07

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 >gi|75841|pir||...  55  7e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 >gi|75840|pir||...  54  2e-07
gi|1429234|gnl|PID|e1173408 (X99260) gene 6 product [Bacterioph...  54  2e-07

Table 25

Homologies between 182 ORFs and proteins in public databases 



Phage: 182
Database: Swissprot 

Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)  384  e-106
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9)  374  e-103
gi|2500030|sp|Q59968|CARA_SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM...  49  2e-05

Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE  665  0.0
gi|118849|sp|P03680|DPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP2)  654  0.0
gi|118851|sp|P06950|DPOL_BPPZA DNA POLYMERASE (EARLY PROTEIN GP2)  654  0.0
gi|118847|sp|P22374|DPOM_ASCIM PROBABLE DNA POLYMERASE  71  7e-12
gi|461962|sp|P33537|DPOM_NEUCR PROBABLE DNA POLYMERASE  65  3e-10
gi|461963|sp|P33538|DPOM_NEUIN PROBABLE DNA POLYMERASE  62  3e-09
gi|118850|sp|P10479|DPOL_BPPRD DNA POLYMERASE (PROTEIN PI)  56  2e-07
gi|232013|sp|P30322|DPOM_AGABT PROBABLE DNA POLYMERASE  46  2e-04
gi|118887|sp|P10582|DPOM_MAIZE DNA POLYMERASE (S-1 DNA ORF 3)  46  2e-04

Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN ...  309  6e-84
gi|138118|sp|P07531|VG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN ...  305  7e-83

Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR...  52  2e-06
gi|137933|sp|P07538|VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR...  47  6e-05

Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROT...  402  e-112
gi|137944|sp|P11014|VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT...  402  e-112

Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR ...  256  3e-68
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR ...  256  5e-68

Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

gi|462559|sp|P34020|LYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC...  75  2e-13
gi|126652|sp|P25310|LYCM_STRGL LYSOZYME M1 PRECURSOR (1,4-BETA-...  60  5e-09
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  60  5e-09
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  57  4e-08
gi|126597|sp|P00721|LYCH_CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME...  57  4e-08
gi|127788|sp|P19385|LYCA_BPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE...  57  5e-08
gi|3025168|sp|P76421|YEGX_ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN...  53  5e-07

Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

gi|137921|sp|P04333|VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE...  171  1e-42
gi|131782|sp|P12753|RA50_YEAST DNA REPAIR PROTEIN RAD50 (153 KD...  48  2e-05
gi|1176490|sp|P40889|YJW5_YEAST HYPOTHETICAL 197.6 KD PROTEIN I...  45  1e-04
gi|731903|sp|P40434|YIR7_YEAST HYPOTHETICAL 197.5 KD PROTEIN IN...  45  1e-04
gi|730275|sp|P39793|PBPA_BACSU PENICILLIN-BINDING PROTEINS 1A/1...  45  2e-04
gi|1168610|sp|P41696|AZF1_YEAST ASPARAGINE-RICH ZINC FINGER PRO...  44  3e-04
gi|731587|sp|P38900|YH19_YEAST HYPOTHETICAL 70.1 KD PROTEIN IN ...  44  3e-04

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN 69 8e-12

Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

gi|137928|sp|P07537|VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE... 51 2e-06
gi|137927|sp|P20345|VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE... 50 3e-06

Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14) 97 2e-20
gi|137938|sp|P07539|VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14) 96 2e-20

Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

gi|138072|sp|P06953|VG5A_BPPZA EARLY PROTEIN GP5A 69 2e-12

Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

gi|138111|sp|P13848|VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE ... 57 9e-09
gi|138112|sp|P07533|VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE ... 54 4e-08

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6 55 2e-08
gi|138098|sp|P03685|VG6_BPPH2 EARLY PROTEIN GP6 54 5e-08
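
Each hit line in the summaries above carries an identifier, a description, a bit score, and an E value. As a minimal sketch of how such a line could be split into fields, the following assumes the identifier has the five-part form gi|number|database|accession|entry and that the last two whitespace-separated tokens are the score and E value. The function names `parse_expect` and `parse_hit` are hypothetical and are not part of BLAST.

```python
def parse_expect(text):
    """Convert an E-value string to float; BLAST prints 1e-106 as 'e-106'."""
    return float("1" + text if text.startswith("e") else text)


def parse_hit(line):
    """Split one summary line into identifier fields, description, score, E value."""
    ident, rest = line.split(None, 1)
    description, score, expect = rest.rsplit(None, 2)
    fields = ident.split("|")
    return {
        "gi": fields[1],
        "database": fields[2],
        "accession": fields[3],
        "entry": fields[4],
        "description": description,
        "score": int(score),
        "expect": parse_expect(expect),
    }


# Example line taken from the 182ORF014 summary.
hit = parse_hit(
    "gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14) 97 2e-20"
)
print(hit["entry"], hit["score"], hit["expect"])
```

Lines whose identifier lacks the five-part form (for example a bare gi number followed by a GenBank accession in parentheses) would need separate handling.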
BLASTP 2.0.8 (Jan-05-1999)



Query= sid|110156|lan|182ORF001 Phage 182 ORF|5966-7780|2
(604 letters)

>gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)
>gi|75849|pir||WMBP9Z gene 9 protein - phage PZA
>gi|216058 (M11813) tail protein [Bacteriophage PZA]
Length = 599

Score = 384 bits (975), Expect = e-105

Identities = 231/610 (37%), Positives = 344/610 (55%), Gaps = 36/610 (5%)

Query: 6 TNVKLIANVPFDNTYTHTRWFKTQ^EQESY^ 65 

TNV++LA+VPF N Y +TRWF + Q ++FNS + E ++Q + V 
Sbjct: 9 TNVTIILADVPFSNDYKNTRWFTSSSNQYNWFNSOT 68 

Query: 66 KDALYACNYLIFKNEETTPSKWQYAFVTDIEYKNDNTSFV^^ 125 

D LY +Y++F+N + Y +KW YAFVT++EYKN T++V FEIDVLQT+ F+I +ESF 
Sbjct: 69 LDLLYNASYIMFQNAD - YGNKWFYAFVTELEYKNVGTTYVHFE IDVLQTWMFNIKFQESF 127 

Query: 126 IAKEHPQLYYSNGIPFINTIEESIJDYGREYTTTNVTrFHPNDGWFLVILTSEAM- - PVG 183 

I +BH +L+ +G P INTI+E L+YG EY +V P D + FLV+++ M G 
Sbjct: 128 I VR EHVKLWNDDGT FT INT IDEGLNYG SEYD I VS VENHRP YDDMMFL W I S K£ I MHGTAG 187 

Query: 184 DKEDKSG GSIVGGPSPFSYYLLPINSSGEVYKPN-GAGNANFGEYMAFLT TKEP 236 

+ E + S+ G P P YY+ P G+V K G NAN + LT ++ + 

Sbjct: 188 EAESRLm)INASI^GMPQPLCYYIHPFYKIX;KVPKTFIGDNNANI^PIVNMLTNIFSQ^ 247 

Query: 237 FLNKIVGrm/TSYTGIPFIvDHANKTWYNAGGSYKI^ 296 

♦N IV MYVT YG+ ++K+++ + + ADG+T VK+ 
Sbjct: 248 AVNN I VNMYVTDYIGLKLDYXNGDKELKLDKDMFEQAGI ADDKHGNVDTI F VKK 301 

Query: 297 ARTFVPKRIDLVGNVYNYFREAFPFNVKESKLFMYPYCLIEITDTKGHVMTLRPEYLTGG 356 

+ ID G+ + F + +ESKL MYPYC+ E+TD KG+ M L+ EY+ 

Sbjct: 302 I PDYETLE ID - TGDKWGGFTKD QESKLMMYPYCVTEVTDFKGNHMNLKTEYIDNN 355 

Query: 357 KLSVYVKGSLGISNKVMIEPIDYDVSNSTI 1 TNLSDKMLI DND PNDVGVKSDYASA 412 

KL + V+GS LG+SNKV DY+ S +T D LI+N-f PND+ + +DY SA 

Sbjct: 356 KLKIQVRGSLGVSNKVAYSIQDYNAGGSLSGGDRLTASLDTSLINNNPNDIAIINDYLSA 415 

Query: 413 FMQGNKNS LI AQEQNIRNTFRHGMGNSAMSTGGAI FSALASNNP FVGLTNIMGAGQQVNN 472 

++QGNKNSL Q+ +1 GM +S G ++ +PF ++♦ G N 
Sbjct: 416 YLQGNKNSLENQKSS I LFNGI VGMLGGGVSAG ASAVGRS PFGLASS VTGMTSTAGN 471 

Query: 473 YVSEKENGLNLLAGKVADIENI PDNVTQLGSNLSFTTGN- FQNYYQLRFKQIKYEYATRL 531 

V + + L K ADI NIP +T++G N +F GN ++ Y ++ KQ+K EY L 
Sbjct: 472 AVLD MQALQAKQADI ANI PPQLTKMGGNTAFDYGNGYRGVYVI K- KQLKAEYRRSL 526 

Query: 532 DRYFS^fYGTKSNRVATPNLQTRKAWNFIKLKEPNIVGTMSNDVLTRVKQIFSAGVTLWHT 591 

+F YG K NRV PNL+TRKA+N+I+ K+ I G ++N+ L ++ IF G+TLWHT 
Sbjct: 527 SSFFHKYGYKINRVKKPNLRTRKAYNYIQTKDCFISGDINNNDLQEIRTIFDNGITLWHT 586 

Query: 592 NDVLNYNQDN 601 

+D+ NY+ +N 
Sbjct: 587 DDIGNYSVEN 596 
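
The query headers in this listing give each ORF's nucleotide coordinates and a trailing digit, followed by the translated length in letters. As a hedged consistency check, the sketch below assumes (this is inferred from the numbers and is not stated in the listing) that the coordinates are 1-based and inclusive, that the span ends with a stop codon, and that the trailing digit is the forward reading frame. The function name `orf_summary` is hypothetical.

```python
def orf_summary(start, end):
    """Return (amino_acids, frame) for a forward-strand ORF.

    Assumes 1-based inclusive coordinates and a span that ends with a
    stop codon, which is not counted in the translated length.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    amino_acids = nucleotides // 3 - 1  # drop the stop codon
    frame = (start - 1) % 3 + 1         # reading frame 1, 2, or 3
    return amino_acids, frame


# Coordinates copied from the 182ORF001 and 182ORF002 query headers.
orf001 = orf_summary(5966, 7780)
orf002 = orf_summary(2152, 3873)
print(orf001, orf002)
```

Under these assumptions the results agree with the "(604 letters)" and "(573 letters)" lines and with the trailing digits 2 and 1 in the two headers.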



Query= sid|110157|lan|182ORF002 Phage 182 ORF|2152-3873|1
(573 letters)

>gi|118848|sp|P19894|DPOL_BPM2 DNA POLYMERASE >gi|76896|pir||JQ0161
DNA-directed DNA polymerase (EC 2.7.7.7) - phage M2
>gi|215509 (M33144) DNA polymerase [Bacteriophage M2]
Length = 572

Score = 665 bits (1697), Expect = 0.0

Identities = 327/589 (55%), Positives = 420/589 (70%), Gaps = 38/589 (6%)

Query: 3 KKYTGDFETTTDLNDCRWSWGVO)IDNVDN^ 62 
K ♦+ DFETTT L+DCRVW++G +1 N+DN G +D F +W M+ D-fYFHN KF 
Sbjct: 4 KMFSCD F ETTTKLDDCR VWA YG YME I GNLDNYKIGNS LD EFMQWV - ME I QAD L Y FHN LKF 62 

Query: 63 DGEFMLSWLFKNGFKWCKEAKEDRTFSTLISNMGQWYALEICWEVNYXXXXXXXXXXXXX 122 

DG F+++WL ++GFKW E + T++T+IS MGQWY ++IC+ 
Sbjct: 63 DGAFI VNWLEQHGFKWSNEGLPN - TYNTI ISKMGQWYMIDICFGYK GKRKL 112 

Query: 123 XXI I YDS LKKYP FP VKQI AEAFNF P I KKG E 1 D YTKE R P I G YKPT KDEWE YLKND I Q I MAM 182 

+IYDSLKX PFPVK+IA+ F P+ KG+IDY ERP-K3++ T +E+EY+KNDI+I+A 
Sbjct: 113 HTVI YDS LKJCLP FP VTCKI AKD FQLPLLKGDI D YHTERPVGHEI T PEE YE YIKND I E I IAR 172 

Query: 183 AUCIQ FDQGLTRMTRGSDALGDYKDWLKATHG KSTFKQWF P I LS LGFDKDLRXAYKGG FT 242 

AL IQF QGL RMT GSD+L +KD L F + FP LSL DK++RKAY+GGFT 

Sbjct: 173 ALDIQFKQGLDRMTAGSDSLKGFKDILST KKFNKVFPKLSLPMDKEIRKAYRGGFT 228 

Query: 243 WVNKVFQGKEIGDGIVFDVNSLYPSQMYVRPLPYGTPLFYEGEYKPNNDYPLYIQNIKVR 302 

W+N ++ KEIG+G+VFDVNSLYPSQMY RPLPYG P+ ++G+Y+ ♦ YPLYIQ 1 + 
Sbjct: 229 WIjnDKYKEKEIGEGMVFDVNSLYPSQMYSRPLPYGAPIVFQGKYEKDEQYPLYIQRIRFE 288 

Query: 303 FRLKEGYIPTIQVKQSSLFIQNEYLESSVNKI/3\TOELIDLTLTNVDLELFFEHYDILEIH 362 

F LKEGYIPTIQ+K++ F NEYL++S GV E ++L LTNVDLEL EHY++ + 
Sbjct: 289 FELKEGYIPTIQIKKNPFFKGNEYLKNS GV - E P VELYLTNVDLELI QEHYELYNVE 343 

Query: 363 YTYGYMFKASCDMFKGWIDKWIEVWm'EGARKANAKGMLN^ 422 

Y G+ F+ +FK +IDKW VK EGA+K AK MLNSLYGKF +NPD+TGKVPY+ 
Sbjct: 344 YIDGFKFTIEKTGLFKDFIDKWTYVKTHEEGAKXQLAKLMLNSLYGKFASN 403 

Query: 423 G EDG I VRLT LGE EELRD P VYVPLAS FVT AWGR YTT I TT AQKCFDR 1 1 YCDTD S I HLVGTE 482 

+DG + -K5+EE +DPVY P+ F+TAW R+TTIT AQ C+DRI I YCDTDS IHL GTE 
Sbjct: 404 KDDGS LGFRVGDEEYKDP VYTPMG VF ITAWARFTT I TAAQACYDRI I YCDTDS I HLTGTE 463 

Query: 483 VPEA I DHLVD PKKLG YWGHESTFQRAXFI RQKT YVEEIDGEL 524 

VPE I +VD PKKLG YW HESTF+RAK++RQKT YV+E+DG+L 
Sbjct: 464 VPEIIKDIVDPKKLGYWAHESTFICRAKYIjRQKTYIQDIYVKEVDGKLKEC^ 523 

Query: 525 NVKCAGM PDR I KE I VT FDNFEVG FS S YG KLL PKRTQGG WL VDTMFT I K 573 

+VKCAGM D IK+ VTFDNF VGFSS GK P + GGWLVD++FTIK 
Sbjct: 524 S VKCAGMTDT I KKKVTFDNFAVG FSSMGKPKPVQVNGG WLVDS VFT I K 572 



Query= sid|110159|lan|182ORF004 Phage 182 ORF|4626-5954|3
(442 letters)

>gi|138117|sp|P13849|VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN GP8)
>gi|75845|pir||WMBP89 gene 8 protein - phage phi-29
>gi|215325 (M14782) major head protein [Bacteriophage
phi-29] >gi|225362|prf||1301270B gene 8 [Bacillus sp.]
Length = 448

Score = 309 bits (783), Expect = 2e-83

Identities = 176/440 (40%), Positives = 250/440 (56%), Gaps = 27/440 (6%) 

Query: 4 KITEQD VT^RATNVETPVQLMTAI YNS S SSLFQANVPMPNADN I EAVGAG XTRLDWKNE F 63 

♦IT DV + + ++ AI NS F++ VP+ A+N+ VGAGI V+N+F 

Sbjct: 2 RI TFNDVKTS LG ITESYD I VNAI RNSQGDNFKS YVP LATANNVAE VG AG I LINQTVQNDF 61 

Query: 64 ISTLVDRIGKWIRYKSWRNPUCMFKKGNMPLGRTIEEIFTO^ 123 

I ++LVDRIG WIR S NPLK FKKG +PLGRTIEEI+ DI +E +++ +E+ VF+ 
Sbjct: 62 ITSLVDRIGLWIRQVSIiNNPLFCKFKKGQIPLGRTIEEIYTDITKEKQYDAEEAEQKVFE 121 

Query: 124 QEVPDVKTLFHEINREGYYKQTIQEAWLEKAFTSWDNFNSFVAGVMNALYTGDEVSEFEY 183 

+E+P+VKTLFHE NR+G+Y QTIQ+ L+ AF SW NF SFV+ ++NA+Y EV E+EY 
Sbjct: 122 REMPNVKTLFHERNRQG FYHQT I QDDS LKTAFVS WGNFES FVSS 1 1 NAI YNS AEVDEYEY 181 

Query: 184 TKLLIANYQEKELFKEIEIGEITESNA- -KEFIRKIKSTSNKLEFM- -SSAYNAQGVKTS 239 

KLL+ NY K LF ++I B T S EF++K+++T+ KL S +N+ V+T 

Sbjct: 182 M KLLVDNYYS KGLFTTVK I D E PTS STGALTE FVKKMRAT ARKLTL PQGS RDWNS MAVRTR 241 

+ D + FNM++TDF+G+ VID F S+ + AV+V 

Sbjct: 242 S YMEDLHLI IDADI*EAEIJ}VDVLAKAFNMNRTDFIX5NVTVIDGF ASTGLEAVLV 295 

Query: 300 DSEWFMIYDKLYKTTSLYNPEGLYWNYWLHHHQLYSTSQFGNAVAFVKSATKPVTKVAFA 359 

D +WFM+YD L+K ++ NP GLYWNY+ H Q S S+F NAVAFV VT+V + 

Sbjct: 296 DKDWFMVYDN LH KMETVRN PRGL YWNYYYHVWQTLS VS R FANA VAFVSGDV P A VTQ VI VS 355 

Query: 360 SATTSVVKG SSKDI ALT FTPVEATNQQGEVVS SAPAL VKATVKQTAGKATA VTVEG LEVG 419

+V +G + V ATN + V V G +T + G

Sbjct: 356 PN I AA VKQGGQQQFT AYVRATNAKDHKV - - VWSVEGGSTGTAI - - -TG 398

Query: 420 QS LVTFTA I GGQQATVLVTV 439

L++ + Q TV TV

Sbjct: 399 DG LLS VSGNEDNQLTVKATV 418



Query= sid|110160|lan|182ORF005 Phage 182 ORF|12651-13700|3
(349 letters)

>gi|137932|sp|P15132|VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE
PROTEIN GP13) >gi|75858|pir||WMBP23 gene 13 protein -
phage phi-29 >gi|215331 (M14782) morphogenesis protein
[Bacteriophage phi-29] >gi|225368|prf||1301270H gene 13
[Bacteriophage phi-29]
Length = 365

Score = 51.5 bits (121), Expect = 8e-06

Identities = 44/166 (26%), Positives = 70/166 (41%), Gaps = 14/166 (8%)



Query: 6 NEQI ARGQTI AKI LSKYGWKNSQVGVVANLHWESA GLNPNSNEXXXXXXXXX - QWT 61 

+E Q I LS G+ K + G++ N+ ES GL N +E QWT 

Sbjct: 12 S EMKVNAQYI LNYLSSNGWTKQAI CGMLGNMQS EST INPGLWQNLDEGNTSLGPGLVQWT 71 

Query: 62 PKSNLYRQAQI CGLSNAKAETLEGQAE 1 1 AQGDKTGQWMDNTPVS S AG YTNPQTLSAFKQ 121 

P SN A GL II + + QW++ ++ Y K 
Sbjct: 72 PASNYINWANSQGLPYKDMDS - - ELKRI IWEVNNNAQWINLRDMTFKEY IKS 121 

Query: 122 SANIDVATINFMCHWERPGKLHIEERLDLAQAYSKHIDGSGGGGVK 167 

+ + F+ +ERP + ER D A+ + JC++ G GGGG++ 

Sbjct: 122 TKTPRELAMIFLASYERPANPNQPERGDQAEYWYKNLSGGGGGGLQ 167 



Query= sid|110161|lan|182ORF006 Phage 182 ORF|14995-16026|1
(343 letters)

>gi|137945|sp|P07541|VG16_BPPZA ENCAPSIDATION PROTEIN (LATE PROTEIN
GP16) >gi|75861|pir||WMBP16 gene 16 protein - phage PZA
>gi|216065 (M11813) morphogenesis protein C
[Bacteriophage PZA]
Length = 332

Score = 402 bits (1023), Expect = e-111

Identities = 186/332 (56%), Positives = 244/332 (73%), Gaps = 2/332 (0%)

Query: 11 EKNLYYNPNNALGFNCLMLFVIGARGIGKTYGYKXFV^ 70 

+K+L+YNP L ++ ++ FVIGARGIGK+Y K + +NRFIK+GEQFIY+RR+K EL K 
Sbjct: 2 DKSLFYNPQKMLSYDRILNFVIGARGIGKSYAMKVYPINRFIKYGEQFIYVRRYKPELAK 61 

Query: 71 IPQFFKTMAKEFPDHKLEVKGKEFYCTDKLMGWAVPLSTWGIEKSN^ 130 

+ +F +A+EFPDH+L VKG+ FY D. KL GWA+PLS W EKSN YP V TI+FDEF+ 
Sbjct: 62 VSNYFNDVAQEF PDHELVVKGRRFYI DGKLAGWAI PLS VWQS EKSNAYPNVSTI VFDEFI 121 

Query: 131 I E KS K I TYL PNEAEAI*LNMMETVFRRRTNTRCVMLSNAT S WNP YFL YFNLQPDLNKRFN 190 

EK Y+PNE ALLN+M+TVFR R RO LSNA SWNPYFL+FNL PD+NKRFN 
Sbjct: 122 REKDNSNYI PNEVSALLNLMDTVFRNRERVRCI CLSNAVS WNPYFLFFNLVPDVNKRFN 181 

Query: 191 LYQDRGI LI ELCDSKDFAEVKRETPFGRLIRGTEYEDFS INNEFVNDSDTFI EKRSKNSS 250 

♦Y D LIB+ DS DF+ +R+T FGRLI GTEY + S++N+F+ DS FIEKRSK+S 
Sbjct: 182 VYDD--ALIEIPDSLDFSSERRKTRFGRLIDGTEYGEMSLDNQFIGDSHVFIEKRSKDSK 239 



Query: 251 FLCAI AFEGKI EGYWIDAETGCVYVSYDYQPNTNHFYAMTTKDHEENRLLMKNWRNNY^^ 310
F+ +1 + G G W+D G +YV + P+T + Y +TT D EN +L+ N++NNY+L 
Sbjct: 240 FVFSIVYNGFTI/3VWVTJVNQGIJ^YVI)TAHDPSTK^ 299 

Query: 311 STVAKAFKNSYLRFDNIVIKNLHYDLFNKMKI 342 

+A AF N YLRFDN VI+N+ Y+LF KM+I 
Sbjct: 300 RKLASAFMNGYLRFDNQVIRNIAYELFRKMRI 331 



Query= sid|110162|lan|182ORF007 Phage 182 ORF|7795-8775|1
(326 letters)

>gi|1429239|emb|CAA67658| (X99260) upper collar protein
[Bacteriophage B103]
Length = 308

Score = 271 bits (685), Expect = 6e-72

Identities = 131/275 (47%), Positives = 187/275 (67%), Gaps = 5/275 (1%)

Query: 36 YYEH YRRQLTLLT FQL FEWENLPKS I D PRYLE I ALHTNG YLG FFKD PTLG FMVCAGAEDG 95 

+Y HY + L L +QLFEWE LP S+DP YLE ++H GY+GF+KDP +G++ C GA G 
Sbjct: 22 WYYHYYQYLCSLAYQLFEWERLPPSVDPSYLEKSIHQFGYVGFYKDPRIGYIACQGALSG 81 

Query: 96 QIDHYHNPIFFTANEAMYHKRYPVLRYDDDDDKSKCIML^ 155 

+DHY+ PFA+ Y ++YD +K* + +YNNDLK TLP+L FA D+A 
Sbjct: 82 TVDHYNL PDRFHAS S VG YQNTFKL YNYSDMKEKNMGVAI YNND LKCSTLPALEMF AQDLA 141 

Query: 156 D I NQ I S R VNRRAQ KT P VI I QTDEKKYF S LLQA YNQ I D ENNQ A VFVDKDME FD ES FNVWQT 215 

++ +1 VN+ AQKTPV+I ++ SL YNQ *■ N +FV + ++ D + V++T 
Sbjct: 142 ELKEI IAVNQNAQKTPVLIAANDNNQLSLKNIYNQYEGNAPVI FVHESLDLD-NLKVFKT 200 

Query: 216 NAPYVVDKLRSELNEVWNEVLTFTXUNNANTO^ 275 

+APYWDKL ++ N VWNEV+T+LGI NAN++K R+ TSEV SN+BQIESSGNI LK+R 
Sbjct: 201 DAPYVVDKLNAQKNAVWNEVMTYLGIKNAN^ 260 

Query: 276 KEFCDRVNRVFGDELDGKIDVKFRTDAVRQLQLAA 310 

+E C++++ ++G L VKFR D V Q-n-L A 

Sbjct: 261 QEACNKI SELYGLNL KVKFR YD I VEQMRLNA 291 


Query= sid|110163|lan|182ORF008 Phage 182 ORF|14105-14983|2
(292 letters)

>gi|4210750|emb|CAA10710| (AJ132604) LysL protein [Lactococcus
lactis]
Length = 235

Score = 139 bits (347), Expect = 2e-32

Identities = 85/210 (40%), Positives = 114/210 (53%), Gaps = 14/210 (6%) 

Query: 2 MNGID I S S YQTG IDLS KV PCD FVN I KATGGTGYVNPDCDRAFQQALSLGKKIG VYHFAHE 61 

MNGIDISSYQ ++ VP DFV IKAT GT Y+NP + Q + K +G YHFA 
Sbjct: 1 MNGID I S S YQAELNAG I VP SDFVI I KATEGTNY INPTWE EQAGQV I QTNKLLG FYHFAS - 59 

Query: 62 RGLEGTPQQEAQFFLDNIKGYIGKAVLILDFEGS - -NQKDVNWAKAFLDYVYNKTGVKAW 119 

G P EA FF+ +K YIGKAVL+LDFE N A+ FL+ V KTG+ 

Sbjct: 60 VGNP I AEAD FFI S VVKNYIG KAVLVLDFEAGA I NAWGNVGARQFLNRVKEKTG I NPM 116 

Query: 120 FYT YTANLNTTDFS S I AKGD YGLWVAEYGSNQ PQG YS Q PAP PKTNN FPIVACFQF 174 

Y + ++S+I+ + LWVA+Y S P GY + P T+ + A Q+ 

Sbjct: 117 I YMSSDVTRQFNWSTISSTN - PLWVAQYASMNPTGYQ - - SEPHTDGKGYGAWSSAAXHQY 173 

Query: 175 TSKGRLPGYNGNLDLNVFYGDGNTWDLYVG 204 

+S G L ++GNLD+N+ Y + N W G 
Sbjct: 174 SSAGSLSNWSGNLDINLAYINANQWKSLAG 203 



Query= sid|110164|lan|182ORF009 Phage 182 ORF|8765-9601|2
(278 letters)

>gi|1429240|emb|CAA67659| (X99260) lower collar protein
[Bacteriophage B103]
Length = 293

Score = 180 bits (451), Expect = 1e-44

Identities = 115/296 (38%), Positives = 161/296 (53%), Gaps = 33/296 (11%)

Query: 3 LKRYIESFTYYQPELSRKERIEvGRKQLFDFDYPFYDETKRAEFETKFINHFYLREIGSE 62 

L YIE Y+ LS E+IE GR +LFDF YP +DE+ R FET FI +FY+REIG E 
Sbjct: 8 LST Y I EMWSQYETG LSNAEKI EKGRPKLFDFQYP I FDES YRKVFETHF I RNF YMRE I G FE 67 

Query: 63 TMGS FKFNLDE YLNLNMP YWNKMFLSNLEE F - PI FDDMD YT I D EKQKLLN E I DTN I KANR 121 

T G FKFNL+ +L +NMPY+NK+F S L +♦ P+ ♦ T K+ DT NR 

Sbjct: 68 TEGLFKFNLETWLI INMPYFNKLFESELI KYDPLENTRLNTTGNKKN DTERNDNR 122 

Query: 122 D ESKNQTKQVDQTDNRNKNTRDTGTT DSFSRNTYTDTPQKDLRIASNG 169 

D + K+ TK D+T+ + D TT D+F+R +D P L + +N 
Sbjct: 123 DTTGSMKADGKSNTKTSDKTNATGSSKEDGKTTGS 181 

Query: 170 DGTGVINYATNITEDLSKETTSSTGVETNNDKTNQNTRSNAS EKETKNTD 219 

DQ G + YA+ I E+ + ++TG TNN +♦ ♦ S S T N 

Sbjct: 182 DGQGTLEYASAIEENNTNNKRNTTG - -TNNVTSSAESESTGSGTSDTVTTDNANTTTNDK 239 

Query: 220 INKDQNQTKDTITRYKGKKGNTDYADLLEKYRRS VLRI EKMI FREMNKEGLFLLVY 275 

+ N N +D I GK G YA L++ YR ++LRIEK IP EM + LF+LVY 
Sbjct: 240 LNSQINNVEDYIESKIGKSGTQSYASLVQDYRAALLRIEKRI FDEMQE - - LFMLVY 293 

Query= sid|110165|lan|182ORF010 Phage 182 ORF|1310-2155|2
(281 letters)

>gi|135604|sp|P06812|TERM_BPNF DNA TERMINAL PROTEIN
>gi|75815|pir||ERBPNP terminal protein - phage NF
>gi|579177|emb|CAA68440| (Y00363) gene E product (AA
1-267) [Bacteriophage NF]
Length = 266

Score = 74.9 bits (181), Expect = 6e-13

Identities = 73/275 (26%), Positives = 129/275 (46%), Gaps = 37/275 (13%)

Query: 3 VRISKNDRAKLEKIYGKSNKARKKYNRLRQK-GVE- - - ERQLPTVPTSKKRLIDYVKSTN 58
+RI+ ND+A K+ K+ KA K +R ++K G++ E +LP + + +
Sbjct: 7 IRITNNDKALYAKLV-KNTKA- -KISRTKKKYGIDLSNEIELPPLESFQ 52

Query: 59 MSRSDFNKMLDELVDFAQPYNENYI FEINKRNVAI SRAQIKEAQI KTEQAQKAKEEHYKE 118
+R +FNK + F N+NY F NK + S+A+I E T++AQ+ +E +E
Sbjct: 53 - TREE FNKWKQKQES FTNRANQNYQFVKNKYG I VASKAKINE I AKNTKEAQR I VDEQREE 111

Query: 119 L NKVEVKKPTENTIVTPTILTELGADLPFQAI PDFNIDAFTSPEGVQSYLEN 170
+ K + I++P+ +T G P DFN D S +++ E
Sbjct: 112 IEDKPFISGGKQQGTVGQRMQILSPSQVT- -GISRP SDFNFDDVRSYARLRTLEEG 165

Query: 171 IG - KQDEQYFD ERDQLYYDNFRQAMFTI FNSD - - ADDI VRLLDSMGLDLFMKTYVSNFLD 227
+ K Y+D R + NF + + FNSD +D++V h + DF + Y+ F +
Sbjct: 166

Query: 228
++ +Y EE + B + +KI ++ G+V
Sbjct: 225


Query= sid|110166|lan|182ORF011 Phage 182 ORF|9607-10158|1
(183 letters)

>gi|1429241|emb|CAA67660| (X99260) pre-neck appendage protein
[Bacteriophage B103]
Length = 860

Score = 50.8 bits (119), Expect = 6e-06

Identities = 29/105 (27%), Positives = 56/105 (52%), Gaps = 6/105 (5%)

Query: 8 KRFDGLPAVFKERFSKYPHTEYRYELXLJDEEVSALIAYI^ 67 

+RF+ L + + + +YT+ +L E+++ +1 YLN++G L ND+ N +E 
Sbjct: 7 RRFEKLGEMMVQVYERYLPTAFDESMTLIJSKMNKIIEYLN^ 66 

Query: 68 V-EKI^EITNI)TIJaCWI^DGTLENLINDTVFANYIKEI 111 

+ + LE+ +TL+KW +G +L+ I E+K+ + V 

Sbjct: 67 LNDGLEDYVKETLEKWYEEGKFADLV 1 QVI DELKQFGVS V 106 



Query= sid|110169|lan|182ORF014 Phage 182 ORF|13716-14108|3
(130 letters)

>gi|137936|sp|P11188|VG14_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14)
>gi|75860|pir||WMBP29 gene 14 protein - phage phi-29
>gi|15678|emb|CAA28631| (X04962) gene 14 product (AA
1-393) [Bacteriophage phi-29] >gi|225369|prf||1301270J
gene 14 [Bacteriophage phi-29]
Length = 131

Score = 96.7 bits (237), Expect = 6e-20

Identities = 53/131 (40%), Positives = 81/131 (61%), Gaps = 3/131 (2%)

Query: 1 M I EYI TQWL - ADDNHLVYGLI I WLMVAM I IDFVI/3FT I AKFNKB ID FSS FKAKAG 1 1 VKV 59 

MI ++ +L D+ L+Y L +LMV M++D VLG AK N I FSSFK K G+++KV 
Sbjct: 3 MIAWMQHFLETDETKLIYWLT-FLMVCMVVT)TV^ 61 

Query: 60 AEMVLWYF I P VAVKFGAVG ITMYI TML VGLI LSE I YS I LGH I SD I DDDNNWTD YVKKFL 119 

+EM+L + IP AV F A G+ + T+ L +SEIYSI GH+ +DD ♦++ + ++ F 
Sbjct: 62 SEMILALLAI PFAVPFPA-GLPLLYTVYTALCVSEI YS I FGHLRLVDDKSDFLEILENFF 120 

Query: 120 DGTLNRKDD I K 130 

T + + K 
Sbjct: 121 KRTSGKNKEEK 131 



Query= sid|110170|lan|182ORF015 Phage 182 ORF|854-1225|2
(123 letters)

>gi|15670|emb|CAA24483| (V01155) reading frame 10 (may be gene 4)
[Bacteriophage phi-29]
Length = 124

Score = 69.9 bits (168), Expect = 6e-12

Identities = 39/119 (32%), Positives = 64/119 (53%), Gaps = 3/119 (2%)

Query: 3 I VKST FDT QT P EGMLQ VFNATNGAS I PLRNAI -GEVLELKDI LVYSDEVSGFGGAEPSQA 61 

IVK+TFDT+T EG +++FNA G +N G ++E I Y +G A+ + 

Sbjct: 6 IVTCATFDTETLEGQIKIFNAOJGGGQSFKNIjPDGTIIEANAIAQYKQVSDTYGDAK--EE 63 

Query: 62 ELVAFFTEDGKT YAG VS AVATKS AKNL I DMMTANPD I K PKI S FVEGKSNGGQKFVNLQV 120 

+ F DG Y+ +S ++A +LID++T + K+ V+G S+ G F +LQ+ 

Sbjct: 64 TVTTIFAADGSLYSAISKTVAEAASDLIDLVTRHKLETFKVKWQGTSSKGNVFFSLQL 122 



Query= sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3
(96 letters)

>gi|1429235|emb|CAA67654| (X99260) head morphogenesis protein
[Bacteriophage B103]
Length = 101

Score = 60.9 bits (145), Expect = 1e-09

Identities = 34/96 (35%), Positives = 53/96 (54%), Gaps = 5/96 (5%)

Query: 1 ME I KEHE S I LNG I LES VTDG EAR SKI VEHLEALREDYGATT EALT S ANSTLE KLKKDNEA 60 

MB HE ILN + + + R++ + L+ LR DYG+ + S EKL+ +N 

Sbjct: 3 MERDSHEE I LNKLND P ELEHS ERTEL LQQLRAD YG S VLS E FS E LTSATE KLRAENSD 59 

Query: 61 LVI SNS KLFRERA I VE PAEN - - NEPETDQNI TLDDL 94 

L++SNSKLFR+ I + E ♦ E + IT++DL 
Sbjct: 60 L I VSNS KLFRQ VG ITKEKEEE I KQEELSET ITI EDL 95 



Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2
(88 letters)

>gi|138099|sp|P06955|VG6_BPPZA EARLY PROTEIN GP6
>gi|75841|pir||ERBP6Z gene 6 protein - phage PZA
>gi|216047 (M11813) gene 6 product [Bacteriophage PZA]
>gi|224746|prf||1112171K ORF 6 [Bacteriophage PZA]
Length = 96

Score = 55.0 bits (130), Expect = 8e-08

Identities = 28/79 (35%), Positives = 45/79 (56%)

Query: 4 K1J4QRNVTSTKVEFSEVIVQDGAPTIVPCEPVVLTGKI^EEKALSAIKRKNPDKNVVVTN 63 

K+MQR +T T V «•♦■».+■♦■ DG + G LS E+A +KRK + V V + 

Sbjct: 3 KMMQRE I TKTTVNVAKMVMVIX3E VQVEQLPS ETFVGNIiSMEQAQVniMKRKYKGE P VQVVS 62 

Query: 64 VSHETALYTMPVDKFIELA 82

V T +Y +PV+KF+E+A
Sbjct: 63 VEPNTEVYELPVEKFLEVA 81

Table 26

Secondary structure prediction for ORF 182ORF008 

1 MMNGIDISSY QTGIDLSKVP CDFVNIKATG GTGYVNPDCD RAFQQALSLG KKIGVYHFAH 

CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE 

61 ERGLEGTPQQ EAQFFLDNIK GYIGKAVLIL DFEGSNQKDV NWAKAFLDYV YNKTGVKAWF 

CCCCCCCCHH HHHHHHHHHC CCCCEEEEEE CCCCCCCHHH HHHHHHHHHH HCCCCCEEEE 

121 YTYTANLNTT DFSSIAKGDY GLWVAEYGSN QPQGYSQPAP PKTNNFPIVA CFQFTSKGRL 

EEECCCCCCC CCCEECCCCC CEEEEECCCC CCCCCCCCCC CCCCCCCEEE EEEECCCCCC 

181 PGYNGNLDLN VFYGDGNTWD LYVGKKQDQI VPPENKIFDA TSDEFIFTLT TGSTSVFYFD 

CCCCCCCCEE EEECCCCCCE EEECCCCCCC CCCCCCCCCC CCCEEEEEEC CCCCEEEECC 

241 GETIFELSDP TQLDHIRGTY NHVHGKEIPS MVWTPEQFDI YLKMYEKKPV YK 

CCEEEECCCC CCHHHHCCEE CCCCCCEECC CCCCCCCHHH HHHHHCCCCE EC 
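
The prediction strings in Table 26 use one letter per residue. Assuming the common three-state convention (H for helix, E for strand, C for coil; the table itself does not define the letters), the sketch below collapses a prediction into runs and tallies each state. The helper name `segments` is hypothetical, and the input is the first 60-residue row shown for 182ORF008.

```python
from collections import Counter
from itertools import groupby


def segments(prediction):
    """Collapse a per-residue state string into (state, start, end) runs, 1-based."""
    states = prediction.replace(" ", "")
    runs, position = [], 1
    for state, group in groupby(states):
        length = len(list(group))
        runs.append((state, position, position + length - 1))
        position += length
    return runs


# First prediction row for 182ORF008 (residues 1 to 60), copied from the table.
row1 = "CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE"
runs = segments(row1)
counts = Counter(row1.replace(" ", ""))

for state, start, end in runs:
    print(state, start, end)
print(dict(counts))
```

Concatenating all rows of a prediction before calling `segments` would give runs numbered over the whole ORF rather than over a single row.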



Secondary structure prediction for ORF 182ORF014 



1 MIEYITQWLA DDNHLVYGLI IWLMVAMIID FVLGFTIAKF NKEIDFSSFK AKAGIIVKVA

CCCCEECCCC CCCCHHHHHH HHHHHHHHHH HHHHHHHHHC CCCCCHHHHH HHHCEEEEEE

61 EMVLWYFIP VAVKFGAVGI TMYITMLVGL ILSEIYSILG HISDIDDDNN WTDYVKKFLD

EEEEEEEECC CEEECCCEEE EEEEEEEEEE EEEEEEEECC CCCCCCCCCC CEEEEEEECC

121 GTLNRKDDIK

CCCCCCCEEC



WO 00/32825 



PCT/IB99/02040 



Table 27 

Enterococcus accession numbers 242/242 



gi|2895751|gb|AF044978.1|AF044978 [2895751]
gi|4803755|dbj|AB026843.1|AB026843 [4803755]
gi|4769001|gb|AF140549.1|AF140549 [4769001]
gi|4760901|gb|AF099088.1|AF099088 [4760901]
gi|4704705|gb|AF121254.1|AF121254 [4704705]
gi|3342117|gb|AF076604.1|AF076604 [3342117]
gi|4688824|emb|AJ132470.1|ESP132470 [4688824]
gi|4732085|gb|AF125553.1|AF125553 [4732085]
gi|4732082|gb|AF125552.1|AF125552 [4732082]
gi|4732079|gb|AF125551.1|AF125551 [4732079]
gi|4732076|gb|AF125550.1|AF125550 [4732076]
gi|4732073|gb|AF125548.1|AF125548 [4732073]
gi|4732070|gb|AF125547.1|AF125547 [4732070]
gi|4732067|gb|AF125546.1|AF125546 [4732067]
gi|4732064|gb|AF125545.1|AF125545 [4732064]
gi|4732061|gb|AF125544.1|AF125544 [4732061]
gi|4704653|gb|AF114715.1|AF114715 [4704653]
gi|4704564|gb|AF102550.1|AF102550 [4704564]
gi|4688827|emb|AJ238249.1|EFA238249 [4688827]
gi|4680606|gb|AF125198.1|AF125198 [4680606]
gi|4633279|gb|AF117609.1|AF117609 [4633279]
gi|4633124|gb|AF110130.1|AF110130 [4633124]
gi|4590399|gb|AF124258.1|AF124258 [4590399]
gi|4590336|gb|AF108380.1|AF108380 [4590336]
gi|4590335|gb|AF108379.1|AF108379 [4590335]
gi|4019167|gb|U21300.1|CXU21300 [4019167]
gi|4545122|gb|AF077816.1|AF077816 [4545122]
gi|4433610|gb|AF106614.1|AF106614 [4433610]
gi|4468838|emb|AJ132039.1|EFA132039 [4468838]
gi|4468121|emb|AJ132958.1|BPH132958 [4468121]
gi|4456104|emb|Y17302.1|EHI17302 [4456104]
gi|4433611|gb|AF106615.1|AF106615 [4433611]
gi|4433607|gb|AF106611.1|AF106611 [4433607]



gi|4098267|gb|U76614.1|BLU76614 [4098267]
gi|47019|emb|Y00116.1|SFAMBl [47019]
gi|4158179|emb|AL035206.1|SC9B5 [4158179]
gi|4165458|emb|X79343.1|EF16SSPA [4165458]
gi|4165457|emb|X79342.1|EFTRNALA [4165457]
gi|4165456|emb|X79341.1|EF23SRNA [4165456]
gi|4150978|emb|Y14027.1|EFY14027 [4150978]
gi|4127803|emb|AJ223161.1|EFAJ3161 [4127803]
gi|2956685|emb|Y16413.1|EFENTIJO [2956685]
gi|2665346|emb|Y13922.1|EHY13922 [2665346]
gi|4324675|gb|AF109375.1|AF109375 [4324675]
gi|4234627|gb|AF061013.1|AF061013 [4234627]
gi|4234626|gb|AF061012.1|AF061012 [4234626]
gi|4234625|gb|AF061011.1|AF061011 [4234625]
gi|4234624|gb|AF061010.1|AF061010 [4234624]
gi|4234623|gb|AF061009.1|AF061009 [4234623]
gi|4234622|gb|AF061008.1|AF061008 [4234622]
gi|4234621|gb|AF061007.1|AF061007 [4234621]
gi|4234620|gb|AF061006.1|AF061006 [4234620]
gi|4234619|gb|AF061005.1|AF061005 [4234619]
gi|4234618|gb|AF061004.1|AF061004 [4234618]
gi|4234617|gb|AF061003.1|AF061003 [4234617]
gi|4234616|gb|AF061002.1|AF061002 [4234616]
gi|4234615|gb|AF061001.1|AF061001 [4234615]
gi|4234614|gb|AF061000.1|AF061000 [4234614]
gi|3138990|gb|AF060241.1|AF060241 [3138990]
gi|3138986|gb|AF060240.1|AF060240 [3138986]
gi|4204535|gb|AF094803.1|AF094803 [4204535]
gi|4204534|gb|AF094802.1|AF094802 [4204534]
gi|4204533|gb|AF094801.1|AF094801 [4204533]
gi|4204532|gb|AF094800.1|AF094800 [4204532]
gi|4204531|gb|AF094799.1|AF094799 [4204531]
gi|4204530|gb|AF094798.1|AF094798 [4204530]
gi|4204529|gb|AF094797.1|AF094797 [4204529]
gi|4204528|gb|AF094796.1|AF094796 [4204528]
gi|4204527|gb|AF094795.1|AF094795 [4204527]
gi|4204526|gb|AF094794.1|AF094794 [4204526]

gi|4204525|gb|AF094793.1|AF094793 [4204525]
gi|4204524|gb|AF094792.1|AF094792 [4204524]
gi|4204523|gb|AF094791.1|AF094791 [4204523]
gi|4204522|gb|AF094790.1|AF094790 [4204522]
gi|4204521|gb|AF094789.1|AF094789 [4204521]
gi|4204520|gb|AF094788.1|AF094788 [4204520]
gi|4204519|gb|AF094787.1|AF094787 [4204519]
gi|4204518|gb|AF094786.1|AF094786 [4204518]
gi|4204517|gb|AF094785.1|AF094785 [4204517]
gi|4204516|gb|AF094784.1|AF094784 [4204516]
gi|4204515|gb|AF094783.1|AF094783 [4204515]
gi|4204514|gb|AF094782.1|AF094782 [4204514]
gi|4204513|gb|AF094781.1|AF094781 [4204513]
gi|4204512|gb|AF094780.1|AF094780 [4204512]
gi|3873186|gb|AF034779.1|AF034779 [3873186]
gi|4151367|gb|AF093508.1|AF093508 [4151367]
gi|2828136|gb|AF039903.1|AF039903 [2828136]
gi|2828135|gb|AF039902.1|AF039902 [2828135]
gi|2828134|gb|AF039901.1|AF039901 [2828134]
gi|2828133|gb|AF039900.1|AF039900 [2828133]
gi|2828132|gb|AF039899.1|AF039899 [2828132]
gi|2828131|gb|AF039898.1|AF039898 [2828131]
gi|4103866|gb|AF028812.1|AF028812 [4103866]
gi|4103864|gb|AF028811.1|AF028811 [4103864]
gi|2605925|gb|AF029727.1|AF029727 [2605925]
gi|1402750|gb|U60038.1|EFU60038 [1402750]
gi|1835780|gb|U86375.1|EFU86375 [1835780]
gi|3831555|gb|AF047608.1|AF047608 [3831555]
gi|3790617|gb|AF097414.1|AF097414 [3790617]
gi|3767587|dbj|AB005036.1|AB005036 [3767587]
gi|3757810|gb|AF042288.1|AF042288 [3757810]
gi|3747039|gb|AF093509.1|AF093509 [3747039]
gi|3660559|dbj|AB017811.1|AB017811 [3660559]
gi|1147743|gb|U42211.1|EHU42211 [1147743]
gi|3676412|gb|AF051917.1|AF051917 [3676412]
gi|3676164|emb|AJ011113.1|EFA011113 [3676164]
gi|2612869|gb|AF005726.1|AF005726 [2612869]
gi|2353762|gb|AF016233.1|AF016233 [2353762]



gi|2149899|gb|U94707.1|EFU94707 [2149899]
gi|2149149|gb|U82366.1|LSU82366 [2149149]
gi|1469463|gb|U49512.1|EFU49512 [1469463]
gi|1244503|gb|U35366.1|EFU35366 [1244503]
gi|833854|gb|U26268.1|EFU26268 [833854]
gi|841200|gb|U18931.1|CPU18931 [841200]
gi|460079|gb|U00457.1|U00457 [460079]
gi|460077|gb|U00456.1|U00456 [460077]
gi|535661|gb|L34675.1|INSTRANSPO [535661]
gi|3023041|gb|AF007787.1|AF007787 [3023041]
gi|431124|gb|L15633.1|TRN916ENT [431124]
gi|388106|gb|L23802.1|ENEEBSA [388106]
gi|3608387|gb|AF071085.1|AF071085 [3608387]
gi|3551851|gb|AF076027.1|AF076027 [3551851]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3551743|gb|U57498.1|ECU57498 [3551743]
gi|3243178|gb|AF063010.1|AF063010 [3243178]
gi|3136316|gb|AF063900.1|AF063900 [3136316]
gi|3540256|gb|AF052459.1|AF052459 [3540256]
gi|755215|gb|U17696.1|LLU17696 [755215]
gi|3421437|gb|AF082295.1|AF082295 [3421437]
gi|3421436|gb|AF082294.1|AF082294 [3421436]
gi|3421435|gb|AF082293.1|AF082293 [3421435]
gi|3421434|gb|AF082292.1|AF082292 [3421434]
gi|3341430|emb|Y17797.1|EFY17797 [3341430]
gi|3319647|emb|X69092.1|EHPBP3RA [3319647]
gi|3292886|emb|AJ007584.1|EFA7584 [3292886]
gi|3261536|emb|AL021958.1|MTV041 [3261536]
gi|3250708|emb|Z95150.1|MTCY164 [3250708]
gi|3249688|gb|AF070678.1|AF070678 [3249688]
gi|3249687|gb|AF070677.1|AF070677 [3249687]
gi|3249686|gb|AF070676.1|AF070676 [3249686]
gi|3219158|dbj|AB015233.1|AB015233 [3219158]
gi|2765275|emb|Y12924.1|SPY12924 [2765275]
gi|3183687|emb|Y11621.1|EA16SRRN [3183687]
gi|2765274|emb|Y12923.1|EFY12923 [2765274]
gi|2765273|emb|Y12922.1|ESY12922 [2765273]
gi|2765272|emb|Y12921.1|ESY12921 [2765272]
gi|2765271|emb|Y12920.1|EDY12920 [2765271]
gi|2765270|emb|Y12919.1|ESY12919 [2765270]
gi|2765269|emb|Y12918.1|ECY12918 [2765269]
gi|2765268|emb|Y12917.1|ECY12917 [2765268]
gi|2765267|emb|Y12916.1|EPY12916 [2765267]
gi|2765266|emb|Y12915.1|ESY12915 [2765266]
gi|2765265|emb|Y12914.1|ERY12914 [2765265]
gi|2765264|emb|Y12913.1|EMY12913 [2765264]
gi|2765263|emb|Y12912.1|EHY12912 [2765263]
gi|2765262|emb|Y12911.1|EMY12911 [2765262]
gi|2765261|emb|Y12910.1|EGY12910 [2765261]
gi|2765260|emb|Y12909.1|EDY12909 [2765260]
gi|2765259|emb|Y12908.1|ECY12908 [2765259]
gi|2765258|emb|Y12907.1|EAY12907 [2765258]
gi|2765257|emb|Y12906.1|EFY12906 [2765257]
gi|2765256|emb|Y12905.1|EFY12905 [2765256]
gi|2894541|emb|AJ223332.1|EFAJ3332 [2894541]
gi|2894539|emb|AJ223331.1|EFAJ3331 [2894539]
gi|3108058|gb|AF060881.1|AF060881 [3108058]
gi|3087776|emb|AJ223633.1|EFAJ3633 [3087776]
gi|3080754|gb|AF016483.1|AF016483 [3080754]
gi|2197119|gb|AF003921.1|AF003921 [2197119]
gi|2982722|dbj|AB012213.1|AB012213 [2982722]
gi|2982721|dbj|AB012212.1|AB012212 [2982721]
gi|2058780|gb|B07890.1|B07890 [2058780]
gi|2058779|gb|B07889.1|B07889 [2058779]
gi|2058778|gb|B07888.1|B07888 [2058778]
gi|2058777|gb|B07887.1|B07887 [2058777]
gi|2058776|gb|B07886.1|B07886 [2058776]
gi|2058775|gb|B07885.1|B07885 [2058775]
gi|2058774|gb|B07884.1|B07884 [2058774]
gi|2058773|gb|B07873.1|B07873 [2058773]
gi|2058772|gb|B07872.1|B07872 [2058772]
gi|2058771|gb|B07871.1|B07871 [2058771]
gi|2058770|gb|B07870.1|B07870 [2058770]
gi|2058769|gb|B07869.1|B07869 [2058769]
gi|2058768|gb|B07868.1|B07868 [2058768]
gi|2058767|gb|B07867.1|B07867 [2058767]
gi|2058766|gb|B07866.1|B07866 [2058766]
gi|2058765|gb|B07865.1|B07865 [2058765]
gi|2058764|gb|B07864.1|B07864 [2058764]
gi|2058763|gb|B07883.1|B07883 [2058763]



gi|2058762|gb|B07882.1|B07882 [2058762]
gi|2058761|gb|B07881.1|B07881 [2058761]
gi|2058760|gb|B07880.1|B07880 [2058760]
gi|2058759|gb|B07879.1|B07879 [2058759]
gi|2058758|gb|B07878.1|B07878 [2058758]
gi|2058757|gb|B07877.1|B07877 [2058757]
gi|2058756|gb|B07876.1|B07876 [2058756]
gi|2058755|gb|B07875.1|B07875 [2058755]
gi|2058754|gb|B07874.1|B07874 [2058754]
gi|2058753|gb|B07863.1|B07863 [2058753]
gi|2058752|gb|B07862.1|B07862 [2058752]
gi|2058751|gb|B07861.1|B07861 [2058751]
gi|2058750|gb|B07860.1|B07860 [2058750]
gi|2058749|gb|B07859.1|B07859 [2058749]
gi|2058748|gb|B07858.1|B07858 [2058748]
gi|2058747|gb|B07857.1|B07857 [2058747]
gi|2058746|gb|B07856.1|B07856 [2058746]
gi|2058745|gb|B07855.1|B07855 [2058745]
gi|2058744|gb|B07854.1|B07854 [2058744]
gi|2058743|gb|B07853.1|B07853 [2058743]
gi|2058742|gb|B07852.1|B07852 [2058742]
gi|2058741|gb|B07851.1|B07851 [2058741]
gi|2058740|gb|B07850.1|B07850 [2058740]
gi|2947527|gb|T25933.1|T25933 [2947527]
gi|2924302|emb|X81655.1|EHERMAM [2924302]
gi|2664256|emb|Y12234.1|EFAS48C [2664256]
gi|2879906|dbj|D85752.1|D85752 [2879906]
gi|2746216|gb|AF028836.1|AF028836 [2746216]
gi|2745825|gb|AF039139.1|AF039139 [2745825]
gi|2696019|dbj|AB007844.1|AB007844 [2696019]
gi|48999|emb|X62280.1|EHPBP5G [48999]
gi|2654477|gb|U89914.1|BFU89914 [2654477]
gi|43347|emb|X68646.1|EHPSRAA [43347]
gi|2613034|gb|AH005624.1|SEG_EDDH4RR [2613034]
gi|2613033|gb|AF029775.1|EDDH4RR2 [2613033]
gi|2613032|gb|AF029774.1|EDDH4RR1 [2613032]
gi|2613031|gb|AH005623.1|SEG_EDDHIRR [2613031]
gi|2613030|gb|AF029773.1|EDDHIRR2 [2613030]
gi|2613029|gb|AF029772.1|EDDHIRR1 [2613029]
gi|2613028|gb|AH005622.1|SEG_EDH19RR [2613028]
gi|2613027|gb|AF029771.1|EDH19RR2 [2613027]
gi|2613026|gb|AF029770.1|EDH19RR1 [2613026]
gi|2613025|gb|AH005621.1|SEG_EDISRR [2613025]
gi|2613024|gb|AF029769.1|EDISRR2 [2613024]
gi|2613023|gb|AF029768.1|EDISRR1 [2613023]

gi| 1 88 1 226|dbj[ AB00 1 488 . 1 1 AB00 1488 [1881226] 

gi|2547 1 60|gb| AF023 104. 1 |AF023 1 04 [2547 1 60] 

gi|2547 1 59|gb| AF023 1 03 . 1 1 AF023 1 03 [2547 159] 

gi|2547158|gblAF023102.1|AF023102 [2547158] 

gi|2547157|gb|AFO23101.11AF023101 [2547157] 

gi|2415383|gb|AF015775.1|AF015775 [2415383] 

gi|2388636|gb|U94356.1|EFU94356 [2388636] 

gi|2388634|gb|U94355.1|ECU94355 [2388634] 

gi|2340825|dbj|D26O45. 1 (D26045 [2340825] 

gi|2226147|emb[Y14080.1|BSY14080 [2226147] 

gi|2327026|gb|U87997. 1 (EFU87997 [2327026] 

gi|2318058|gb|AF012532.1|AF012532 [2318058] 

gi|1848175|emb|X87189.1|EM23S5SSP [1848175] 

gi|1848174|emb|X87187.1|EM16S23SS [1848174] 

gi|1848173|emb|X87188.1|EM16S23SP [1848173] 

gi|1848172|emb|X87185.1|EH23S5SSP [1848172] 

gi|1848171|emb|X87184.1|EH16S23SS [1848171] 

gi|1848170|emb|X87181.1|EF23S5SSP [1848170]

gi|1848169|emb|X87183.1|EF23S5SPA [1848169]

gi|1848168|emb|X87191.1|EF23S5SAC [1848168]

gi|1848167|emb|X87180.1|EF16S23SS [1848167] 

gi|1848166|emb|X87182.1|EF16S23SP [1848166] 

gi|1848165|emb|X87190.1|EF16S23SC [1848165] 

gi|1848164|emb|X87186.1|EF16S23SA [1848164] 

gi|1848156|emb|X87179.1|ED23S5SSP [1848156] 

gi|1848155|emb|X87178.1|ED16S23SS [1848155]

gi|1848154|emb|X87177.1|ED16S23SA [1848154] 

gi|2274942|emb|AJ000346.1|EHNAPBC [2274942]

gi|2274939|emb|AJ000042.1|EFGLS24B [2274939]

gi|414575|gb|L12710.1|ENEAAC [414575]

gi|2245603|gb|AF006008.1|AF006008 [2245603]



gi|2231992|gb|U94530.1|EFU94530 [2231992]

gi|2231990|gb|U94529.1|EFU94529 [2231990] 

gi|2231988|gb|U94528.1|EFU94528 [2231988] 

gi|2231986|gb|U94527.1|EFU94527 [2231986] 

gi|2231984|gb|U94526.1|EFU94526 [2231984]

gi|2231982|gb|U94525.1|ECU94525 [2231982]

gi|2231980|gb|U94524.1|ECU94524 [2231980]

gi|2231978|gb|U94523.1|ECU94523 [2231978]

gi|2231976|gb|U94522.1|ECU94522 [2231976]

gi|2231974|gb|U94521.1|ECU94521 [2231974]

gi|2196685|gb|U25090.1|EFU25090 [2196685] 

gi|2197120|gb|AF003922.1|AF003922 [2197120]

gi|2196683|gb|U25095.1|EFU25095 [2196683]

gi|2196681|gb|U25094.1|EFU25094 [2196681]

gi|2196679|gb|U25093.1|EFU25093 [2196679]

gi|2196677|gb|U25092.1|EFU25092 [2196677]

gi|2196675|gb|U25091.1|EFU25091 [2196675] 

gi|2196673|gb|U24682.1|EFU24682 [2196673] 

gi|532533|gb|U09422.1|EFU09422 [532533] 

gi|487271|dbj|D17462.1|ENENTP [487271] 

gi|468459|dbj|D28859.1|ENEPPD1 [468459]

gi|440135|dbj|D16334.1|ENEATPK [440135]

gi|391680|dbj|D13816.1|ENENAABS [391680]

gi|1402524|dbj|D78257.1|D78257 [1402524]

gi|709995|dbj|D30808.1|BACYCB20 [709995]

gi|2109265|gb|U91527.1|EFU91527 [2109265]

gi|1041112|dbj|D78016.1|ENEPPD1A [1041112]

gi|1339880|dbj|D85392.1|ENERPA [1339880]

gi|1339878|dbj|D85393.1|ENEGElE [1339878]

gi|662918|emb|Z46807.1|EHCOPAYZ [662918]

gi|769796|emb|X86176.1|EFRPODDNE [769796] 

gi|1854638|gb|U51479.1|EGU51479 [1854638]

gi|1857221|gb|U72706.1|EFU72706 [1857221]

gi|1857219|gb|U72704.1|EFU72704 [1857219]

gi|1857217|gb|U72705.1|ECU72705 [1857217]

gi|1272655|emb|X96978.1|EFPPD1GNS [1272655]

gi|1272652|emb|X96976.1|EFPLSEP1G [1272652]

gi|1279406|emb|X96977.1|EFPAD1ORF [1279406]

gi|1070149|emb|X93211.1|EFTNFO1 [1070149]

gi|1065723|emb|X92947.1|EFTETMGN [1065723]

gi|1019639|gb|L38972.1|PH4COINJN [1019639]

gi|1151151|gb|U43087.1|EFU43087 [1151151]

gi|1098507|gb|U17283.1|BMU17283 [1098507]

gi|1498072|gb|U64887.1|EFU64887 [1498072]

gi|1498071|gb|U64886.1|EFU64886 [1498071]

gi|1469783|gb|U58049.1|EHU58049 [1469783] 

gi|1763666|gb|U81452.1|EFU81452 [1763666]

gi|624694|gb|L38973.1|PH4SEQ [624694]

gi|1730458|emb|Z83305.1|EFVANRES [1730458]

gi|1419498|emb|X84796.1|ECPFW4 [1419498]

gi|1419497|emb|X84795.1|ECPFW3 [1419497]

gi|1419496|emb|X84794.1|ECPFW1 [1419496]

gi|254400|gb|S43266.1|S43266 [254400]

gi|239025|gb|S66277.1|S66277 [239025]

gi|1054931|gb|U38590.1|EFU38590 [1054931]

gi|1244573|gb|U39788.1|EHU39788 [1244573] 

gi|1244571|gb|U39789.1|EGU39789 [1244571] 

gi|1244569|gb|U39790.1|EFU39790 [1244569] 

gi|1255020|gb|U39777.1|ESU39777 [1255020] 

gi|1255018|gb|U39775.1|EPU39775 [1255018] 

gi|1255016|gb|U39778.1|EDU39778 [1255016] 

gi|1255014|gb|U39776.1|ECU39776 [1255014] 

gi|1255012|gb|U39774.1|EAU39774 [1255012] 

gi|1619922|gb|U69267.1|IVU69267 [1619922]

gi|790436|emb|X84861.1|EFEFMPBP5 [790436]

gi|790434|emb|X84858.1|EFD63RPSR [790434]

gi|790432|emb|X84862.1|EF721PBP5 [790432]

gi|790430|emb|X84860.1|EF63RPBP5 [790430]

gi|790428|emb|X84859.1|EF366PBP5 [790428]

gi|1572800|gb|U70854.1|CELF38A5 [1572800]

gi|1041816|gb|U17153.1|EFU17153 [1041816] 

gi|1086523|gb|U39859.1|EFU39859 [1086523] 

gi|403564|gb|U01917.1|EFU01917 [403564]

gi|1515474|gb|U66286.1|EFU66286 [1515474]

gi|1513068|gb|U15554.1|LMU15554 [1513068]

gi|1296520|emb|X94181.1|EFENTAORF [1296520]

gi|1488069|gb|U63997.1|EFU63997 [1488069]

gi|1209525|gb|U35369.1|EFU35369 [1209525]

gi|1469341|gb|U30931.1|ESU30931 [1469341]
gi|488331|gb|M77276.1|SYNGIP2122 [488331] 
gi|1046177|gb|U39733.1| [1046177] 
gi|1236613|gb|U49939.1|CVU49939 [1236613] 
gi|47491|emb|X55766.1|SS16SR5G [47491] 
gi|47490|emb|X55767.1|SS16SR3G [47490]
gi|47061|emb|X56353.1|SFTET916 [47061]
gi|49022|emb|X62755.1|SFNPRG [49022]
gi|47047|emb|X17214.1|SFPASA1 [47047]
gi|47044|emb|X68847.1|SFNOXAA [47044]
gi|47033|emb|V01547.1|SFKANR [47033]
gi|47018|emb|X02027.1|SF5SRNA [47018]
gi|511044|emb|X75752.1|MP16SRNA0 [511044]
gi|511043|emb|X75751.1|MP16SR243 [511043]
gi|886481|emb|X82819.1|ESPLPAM [886481]
gi|517387|emb|X76177.1|ES16SRR [517387]
gi|472916|emb|X76913.1|EHNTPOP [472916]
gi|43351|emb|X55133.1|ES16SRRN [43351]
gi|1143442|emb|X92687.1|EFPBP5G [1143442]
gi|963032|emb|Z50854.1|EHARPQTOU [963032]
gi|886479|emb|X84818.1|EHDNAPSR [886479]
gi|551437|emb|X81654.1|EHIS1216 [551437]
gi|467805|emb|X78425.1|EFPBP5 [467805]
gi|296721|emb|X55961.1|EFPD78 [296721]
gi|287946|emb|Z19137.1|EFPTSHGN [287946] 
gi|49042|emb|X63285.1|EHNAKA [49042] 
gi|49019|emb|X62658.1|EFSEA1 [49019]
gi|43337|emb|Z12296.1|EFSPREG [43337]
gi|43335|emb|X56895.1|EFPVANAG [43335]
gi|43333|emb|X16421.1|EFPF54 [43333]
gi|43331|emb|X62657.1|EFORF3 [43331]
gi|1065721|emb|X92945.1|EFCAT501 [1065721]
gi|806551|emb|Z49243.1|EF4110SOD [806551]
gi|806549|emb|Z49244.1|EF4105SOD [806549]
gi|505530|emb|X79542.1|EFAS48 [505530]
gi|43323|emb|X62656.1|EFASP1 [43323]
gi|40840|emb|X56422.1|EC16SRNAG [40840]
gi|48189|emb|X04388.1|TN1545TR [48189]
gi|928814|gb|L40841.1|ENETRANSPO [928814]
gi|141856|gb|L01794.1|AD1REPABC [141856]
gi|149125|gb|M90647.1|IP8VANY [149125]
gi|141862|gb|M87836.1|AD1TRAE1 [141862]
gi|141860|gb|M84374.1|AD1TRAA [141860]
gi|141853|gb|M62888.1|AD1PAD1 [141853]
gi|1101637|dbj|D31674.1|EVM16RNA7 [1101637]
gi|1101636|dbj|D31675.1|ENE16RNA8 [1101636]
gi|497792|dbj|D31676.1|ENC16RNA9 [497792]
gi|1022729|gb|U36195.1|EFU36195 [1022729]
gi|488338|gb|M77279.1|SYNGIP3124 [488338]
gi|488335|gb|M77278.1|SYNGIP2563 [488335]
gi|488333|gb|M77277.1|SYNGIP2124 [488333]
gi|488329|gb|M77275.1|SYNGIP2121 [488329]
gi|388267|gb|L19532.1|AD1TRAC [388267]
gi|493016|gb|U03756.1|EFU03756 [493016]
gi|453536|gb|L28754.1|INSTRAN [453536]
gi|153658|gb|M58002.1|STRHYDROLA [153658]
gi|475427|gb|U00681.1|EFU00681 [475427]
gi|818704|gb|U24692.1|EFU24692 [818704]
gi|155036|gb|M97297.1|TRNVAN [155036]
gi|150552|gb|M64978.1|PCFPRGAB [150552]
gi|786274|gb|U22541.1|EHU22541 [786274]
gi|786273|gb|U22540.1|EHU22540 [786273]
gi|559858|gb|L37110.1|AD1CLYL [559858]
gi|643614|gb|U16659.1|ECU16659 [643614]
gi|643612|gb|U16658.1|ECU16658 [643612]
gi|290641|gb|L13292.1|ENECOPPUMP [290641]
gi|624701|gb|L29639.1|ENEVANCRF [624701]
gi|624699|gb|L29638.1|ENEVANCR [624699]
gi|624692|gb|L29641.1|ENEDDLA [624692]
gi|624690|gb|L29640.1|ENEDDL [624690]
gi|493094|gb|L32813.1|ENERRD [493094] 



gi|153852|gb|AH000939.1|SEG_STRTN916 [153852]
gi|153851|gb|M22645.1|STRTN9162 [153851]
gi|153850|gb|M20864.1|STRTN9161 [153850]
gi|153660|gb|M36878.1|STRIF2BA [153660]
gi|153585|gb|M13771.1|STRBRP [153585]
gi|153575|gb|M64265.1|STRATPEFHA [153575]
gi|153565|gb|M90060.1|STRATPASEA [153565]
gi|152969|gb|M92376.1|STABLAIA [152969]
gi|309660|gb|L14285.1|PCFPRGWZY [309660]
gi|433714|gb|L12033.1|ENESATA [433714] 
gi|290645|gb|L15304.1|ENEVANB2A [290645] 
gi|148331|gb|M84146.1|ENEVANR [148331] 
gi|148329|gb|M64304.1|ENEVANH [148329] 
gi|148326|gb|M68910.1|ENEVANCRES [148326]
gi|148324|gb|M75132.1|ENEVANC [148324]
gi|148323|gb|L06138.1|ENEVANB [148323] 
gi|148321|gb|M85225.1|ENETETM [148321] 
gi|148320|gb|L00925.1|ENERTRNA [148320] 
gi|148319|gb|L00924.1|ENERRNA [148319]
gi|148317|gb|M81466.1|ENERECA [148317]
gi|148315|gb|M81961.1|ENENAPA [148315] 
gi|148312|gb|M38386.1|ENEMSPDPS [148312] 
gi|148310|gb|M37185.1|ENEGELE [148310] 
gi|148307|gb|L07892.1|ENEBLACREG [148307]
gi|148305|gb|M60253.1|ENEBELAA [148305]
gi|148303|gb|M77639.1|ENEB14NAM [148303]
gi|290644|gb|L16515.1|ENERGTG [290644]
gi|154954|gb|M37184.1|TRN916 [154954]
gi|148301|gb|M69221.1|ENEAAD9A [148301]
gi|148308|gb|M38052.1|ENECYLB [148308]

Table 28

Phage Dp1 complete genome sequence. 56506 nucleotides.



1 ataataaaaa tatgaagcag atattgggtt aattattgct taacaaaatg caccgaattt gtgtataata 

71 taagtgaagc agttttgtaa acctgacatc ctgctaaata aaaataaagg aggctcgaac atgagtcaaa 

141 acactacacg cactgacgct gaactgacag gcgttactct tttaggaaac caagacacca aatacgatta 

211 tgactataat ccagacgtcc ttgaaacttt ccctaacaaa catcctgaaa ataattacct agtaacattt 

281 gacggatatg aattcacttc cctttgccct aaaacaggac agcctgactt cgcgaatgtt ttcattagtt 

351 acattccaaa cgaaaagatg gttgaatcta aatcattgaa attgtactta ttcagtttcc gtaaccacgg 

421 cgacttccac gaagattgca tgaacattat tttgaatgac ttgtatgaat tgatggaacc taagtacatt 

491 gaagtcatgg gcctattcac tcctcgtggt ggaatttcaa tttacccatt cgtcaacaaa gtgaatcctc 

561 aatttgcaac tcctgaactt gaacagcttc aacttcaacg caaattgaac ttccttggaa atgttcaagg 

631 tcttggacga gctattcgat aggaggctgg aatgaaatca gtagttttat tatccggcgg agtcgactca 

701 gccacttgtt tagcaattga agttgacaag tggggttcta aaaatgttca tgctatagca ttcaattacg 

771 gacaaaagca tgaagcagaa cttgaaaatg ctgctaatgt tgcaatgttc tacggagtca agttcaccat 

841 tcttgaaatt gactcgaaaa tctactcaag ctctagctct tccttattac aaggaaaagg cgaaatttca 

911 catggaaaat cttacgctga aatcctagca gagaaggaag tagttgacac ctatgttcca tttagaaatg 

981 gactaatgct ttcacaggct gcggcttatg cttattcggt tggagcttct tacgtcgtat atggtgctca 

1051 cgcagacgat gcggctggag gtgcttaccc tgattgcact cctgagttct ataattcaat gtcaaatgca 

1121 atggaatatg gaactggagg caaggtaacc cttgtcgctc ctctacttac tctaaccaag gcgcaagtcg 

1191 ttaaatgggg aattgattta gatgttcctt atttcttgac tcgttcatgt tatgaaagtg acgctgaaag 

1261 ttgtggaact tgcgcaactt gtatcgaccg caaaaaggca ttcgaagaaa atggaatgac tgaccctatt 

1331 cattataagg agaattgata tgagagtttc taaaacctta acattcgacg cagctcatca actagttgga 

1401 cattttggaa aatgcgcaaa tttgcacggg catacttaca aagtcgaaat ttcattagca ggcggaactt 

1471 atgaccacgg ttcgagtcaa gggatggttg ttgactttta tcacgtcaag aaaatcgcag gtacattcat 

1541 tgacagactt gaccacgctg ttcttcttca agggaatgaa ccaatcgctt tagcaaatgc agttgacacc 

1611 aagcgagttc tatttggatt tagaactacg gctgagaata tgtcaagatt ccttacctgg actctcacgg

1681 agcttatgtg gaagcatgct cgtatcgact ctatcaaact atgggaaact cctacaggtt gcgcagaatg 

1751 tacttactac gagattttca cagaagacga gattgaaatg ttcaagaacg taacctttat cgacaaagac 

1821 gaaaagatta ctgtccgcga aattttagag caggagcagg ataatggtta atcaatacaa tcagcctgaa 

1891 agaggcaaga ttcgaatcaa tgttcgcgac cctgagaaaa tgcctatcat ggaaattttc ggtcctacaa 

1961 ttcaaggtga aggaatggtt ataggtcaaa agactatttt cattcgaact ggtggatgcg actatcattg

2031 caactggtgt gactcagcct ttacctggaa cggtactact gagccggaat atatcacagg caaagaagct 

2101 gctagtcgaa tcttgaaact agctttcaat gataaaggtg aacagatttg taaccacgtg acattgactg 

2171 gaggaaatcc tgccttaatc aacgagccta tggctaagat gatttcgatt ctaaaagaac atggattcaa 

2241 gtttggtctc gaaactcaag gaactcgatt ccaagaatgg ttcaaagaag taagcgatat cactattagt 

2311 cctaaaccgc cttcaagtgg aatgagaact aatatgaaaa ttcttgaagc tattgtagat agaatgaatg 

2381 atgaaaacct tgactggtca tttaaaatcg ttatctttga cgaaaatgac ctagcttatg cgcgtgatat 

2451 gtttaaaact ttcgaaggca agttacgtcc agtgaactac ctttcagttg ggaatgcaaa cgcatacgaa 

2521 gaaggaaaaa tcagtgatag gcttcttgaa aagttgggat ggctttggga taaagtgtat gaagacccag 

2591 ctttcaacaa tgttcgacct ttaccgcaac ttcatacact tgtttatgat aataaaagag gagtataaaa 

2661 tgaaaattga gcatctagat aaaatcggta acgtattagg gagagagaac ggatgggctt cccttaagcc 

2731 ggatgaaatt gtaaccttgg acaatactga ggcagccgtt caaagacttt ttggtctatt aggcgaggac 

2801 gcagaacgtg acgggttgca agatactcca ttccgttttg ttaaagcact cgctgaacat accgtagggt 

2871 atcgagaaga ccctaaactt catctcgaaa aaacattcga cgtcgaccat gaagaccttg ttcttgtgaa 

2941 agacattcca ttcaattctt tatgtgagca tcatttagct ccgttcgtag ggaaggtgca tattgcatac 

3011 attcctaagg ataagattac aggtctttca aaattcggtc gagtggttga aggatacgct aaacgacttc 

3081 aagtacaaga gcgcttgact caacaaatcg ctgacgctat tcaggaagtt ctaaatcctc aagcagttgc 

3151 ggtcatcgta gaggctgagc atacttgcat gagcggacgc ggtattaaga agcacggggc aacgacagtg 

3221 acttcaacta tgcgaggtct tttccaagat gacgcatctg ctcgagcaga attgcttcag ttgattaaaa 

3291 agtaggaggc ggaaaatgaa taaaagtgca accttttggc ttgttcgaac agctcttatt gcggctctat 

3361 atgtgacatt gaccgttgca ttttctgcta ttagttatgg acctattcaa tttagagtca gtgaagcctt 

3431 gattcttcta cctttatgga accatagatg gactccgggg attgtattag gaacaattat tgcaaacttc 

3501 ttttcacctc ttggactgat tgacgtttta ttcggttcac ttgctacctt ccttggagta gtggcaatgg 

3571 tgaaagttgc taagatggca agtcctctat attcacttat ctgtccagtt cttgctaatg cttaccttat 

3641 tgcgctggaa cttcgaatag tttactcttt acctttttgg gaatctgtca tctatgtagg aattagtgaa 

3711 gcgattatcg ttttaatttc atacttcctt atttccacgc tggcgaagaa caatcatttt agaacactga 

3781 taggagcgaa aaatgggatt taatctatac ttcgcaggag gtcacgctat tagcactgac gattatttga 

3851 aggaaagagg agccaatcgc ctattcaatc aactgtacga aagaaacggg attggcaaaa ggtggattga 

3921 gcataagaaa accaatccaa gcactacttc aaaactattc gtcgactcta gtgcatattc tgctcatacc 

3991 aaaggggctg aagttgacat tgacgcctat atcgaatacg tgaatgataa cgtgggaatg tttgactgta 

4061 tcgccgaact cgataaaatt cctggtgtat ttagacagcc taagacacgt gaacagcttt tggaagcacc 

4131 acaaatttct tgggataatt atctatacat gcgcgagcga atggttgaga aagacaagct cttacctatt 

4201 ttccatatgg gagaagactt taaatggctc aacttgatgc tcgaaactac attcgaaggc ggaaagcata

4271 ttccttacat tggaatttca ccagccaatg actcgactac gaagcataaa gacaagtgga tggaaagagt 

4341 attcgaagtt attcgaaaca gttctaatcc agacgttaag actcacgcat ttgggatgac agttactagc 

4411 caattagagc gtcacccatt ctatagcgcc gactctactt ctgtactgct cacaggagcg atgggaaaca 

4481 ttatgacgtc aaaaggatta gctgacttgt cacagaagaa tggaggaatt gatgctgtcc gtaggctgcc 

4551 aaaaccggtt caagttgaaa ttgaatccat tatcgaagaa actggagcgc attttagcct agagcaatta 

4621 gctgaggact ataaacttcg agcattgttc aatgttcaat acatgctgaa ttgggcagag aactatgaat 

4691 tcaagggaat taaaaatcgt caacgtcgac tattttagat aagagctttt cgctcttatt ttttttaaaa 
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4761 aaaaacgaac tttttataca aaaacgcttg 

4831 aacgaataag aggtaaataa aatgacagca 

4901 actttctaaa agatgttgag tacagtgaca 

4971 tggcgctcat agagatgagc acgatatgaa 

5041 tgaaaaaagc tcaaacttat caagaatatc 

5111 tcgagaagga aaaataggag tcgatgaagc 

5181 gaggaacctc ctttcattgt actcaaaatg 

5251 atatgcttaa aagatttaaa attatttaga 

5321 atatcaaaaa aaggaggctc atattatgag 

5391 cagctcaata agttgaagcc tagcaagttg 

5461 aatgcgtcat gtttacagcg tatgatggct 

5531 tgacgtgatt gtgaaagcag agcagtttgg 

5601 gttcctgaag aatcttcgct aaaagttatt 

5671 aagagcaccc tacattcgac cacttgctcg 

5741 gctgttctac ggaatcgcca atatcaacga 

5811 ggcttcctgt taaaaggcgg aaaagcaatt 

5881 aaaagggact agaaatgctc attccttaca 

5951 gtacttctgg caaattgacg atactactgt 

6021 atggaaggta tggaagatta tgaagacgtt 

6091 cccctacagc agaaatcctg agcgtattag 

6161 cgtcgaatcc ttattcttga aagaccgact 

6231 cacgcatctg ctggcaagaa agtttcgaag 

6301 aaactgtacc aaccgtcacc gaagaaaact 

6371 atcgaatggc gtcgtttact tcctagcact 

6441 gaactgcaaa gatggttaga gcaggaaaca 

6511 ggttattgaa cgaactcagc ctgaatataa 

6561 attcgaaaaa tgtatttcga aagaatcggt 

6651 tgggcgaagc tggaacattt aggcacgaag 

6721 ggactttgaa tggttgaatg tagcagagtt 

6791 cgtttcaaga aaaacgatta tgaaacgaag 

6861 gactagttcg atataaaggc aagctctaca 

6931 acatactgag ccctatgaag aacacaagat 

7001 gtcattttcc tttatgaaaa tcgagataac 

7071 tgaaaaatca agtccttgga aaaattatga 

7141 ctattgctct tcagcctatt gcccatattg 

7211 tgttcgagga agactttttc gaaggtgcaa 

7281 taccactaat ggatttcgag gagttgcaaa 

7351 tttattgaac tgaaaactac taaagaagct 

7421 agctatcacg cgcagatgga tgcaaattta 

7491 gattatatgg tatccaattt caagccttga 

7561 ttcatcgatg cagggtatga agtttcttac 

7631 ttctagatgc agttgagctt cattacaagg 

7701 ttcgagaaga agaaatacga gatgctcaag 

7771 cgacgaaatt gttgaagcag cttgcggttc 

7841 caaaatcctg tcattatgga agaccttaac 

7911 cagatagggc ggaaatggtg ggaatacaaa 

7981 tctatacatt ttagccgccg ggaaaactat 

8051 gaagaagtca tcgaaaatgc ttacaagcga 

8121 aggtattagc atctttaaaa cgaattcaaa 

8191 aaaaggagta ttattaaatg caaaaagacg 

8261 atacacaggt gattgggttg atgtacgaat 

8331 tcaagatgtc gaaaagtgct tcaaaaggct 

8401 cacacggatt tgctcttgaa cttcctaagg 

8471 gaaaactggt ctaatcttcg tttctagcgg 

8541 ttctcagctt ggtatgctac tcgtgacgca 

8611 aggaaaagca acctgctatc aagttcaatt 

8681 aagcacaggt gatttctaat gaaattggaa 

8751 tagcagttca aggacttgaa cgtgaagcgc 

8821 aacctacggc gggctccctc gaaaaagggt 

8891 tcagctctcg acattgtcaa gaatgcgcaa 

8961 tcaaggaaaa gctggaaaat gcgcgtgcat 

9031 actcgatagt cttcaagagc ctcttaagat 

9101 gctaaaaaga ttggagtcga tgttgacaat 

9171 tacttcaata tgttttagac attttcgaaa 

9241 catggtcagt caaaacctta ttgatgaaga 

9311 actgaattta gtcgaaaggt tactcctctt 

9381 ttcgagaaga tatgaatagt cagtacaatg 

9451 tgcagttcga cttaaattta gaaaaggtga 

9521 cgaaaccctg cagggaatgt agtagagtca 

9591 tagtttccta tacgctttcc tatcatgatg 

9661 atttggagtc attcaaaagg caggggcatg 

9731 gatgaagacg aagaaccatt gaagttccaa 

9801 acttattcga catggtgatg actgcggttc 

9871 ctatttggac ctaagctagt gcctgctagt 

9941 aaatcgatga gcaagtggtt gagcttatga 

10011 ttattatttt aatgactcaa ttatagcaga 
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actttattca" ctcattatcg tataatcata atataaataa 
gttcaacaag ttaagttcta cttagaagaa gccggcgctc 
acttagagca agcaattatg aaagatattc ttaaatggaa 
aataacttca tacgaagtat tatagagagg ggtaaggcta 
taaaactagt tgagttcaaa cgtcaacttt ctttaaatct 
ggttattcaa ttattcacct tctatagttt caacaatatc 
caagaggctg ccgtgaacgg gacttatgaa gcaaaactca 
aacggcttta caaactcgcg ataattcgtg tatattatat 
tattaagttc aaaaccgaag aactttcaaa aattgtttct 
ctagaaatca caaactattg gcatattttt ggtgacggcg 
caaacttcct tcgatgcatt atcgacagcg atgttgaaat 
aaaacttgta gaaaagacca cggccgcaac cgtcacatta 
gggaatggtg agtacaatat tgatattgtt acagaagatg 
aagacgtgag tgaagaaaat gctctcactt tgaaaagctc 
ttctgcggta tctaaatcag gagcagatgg aatttatacc 
actacagaca tcattcgcgt atgtatcaac cctatcaagg 
acctaatgag tattttagca agtattcctg atgagaagat 
ctatatttca tcggcttcag tcgaaattta tggaaaattg 
tcacagcttg actcaattga gtttgaagat gatgcggcta 
accgccttgt actattcact tcagcctttg acaaaggaac 
tcgaattaaa acttctacta gcagttatga agacatcatg 
aaagaattca cttgccacct taacagctca ctcttgaagg 
tcactgtctc ttatggaagc gaaaccgcaa ttaagatttc 
tcaagagccg gaagaataat ggccaagtcc aatttaacta 
gtgaaggtcc tgcttcatct tttgtcaatt cgctgacccg 
tccttcgaca tattataagc ccagcggggt tggtggatgt 
gagtctatta tagataacgc agattctaac ctaattgcaa 
ttctccaaga gtacatggtt aaaatggctg aaatcgatga 
cttgaaagaa aatccagttg aaggaactat cgtcgacgag 
tgtaagaacg aacttcttca actttcattc ttgtgtgacg 
ttttagagat taagactgaa accatgttca agttcactaa 
gcaagcaact tgctacggaa tgtgtctagg agtcgatgat 
ttcgaaaaga aagcctacac gtttcacatc acagacgaga 
cctgcgaaga gtatgtagag aaaggcgaaa gtcctaaaat 
tagaaaggaa ggtcgaaatc tgtgagctat actggaaaaa 
aagactttga gaaagatgct ttcacggtcc gtctatatga 
tccctgcgat tatatagccg caactaactt tgggaccttg 
tctttgagct ttaataacat cactgataat caatggttcc 
ttctcgccgg aattttagtg tatttccaaa agcatgaaaa 
aaaaattaaa cggtctggag ttaaaagcgt caacccaaac 
aagaagcgtc gaactagatt gaccattcct ttccaaaatg 
agaaaagcaa tggcaagacc taagttacct caaattgata 
acgtagcaga ctcgtatggt gcgattatca ataaagtagt 
acttgaccag gcaatggaag aaattcaaat agttgtaagc 
tactacattg gctatcttcc cactcttctt tatttcgccg 
tggattcaag ttctgctatc aggaaagaaa aatacgataa 
tcctgacaag caagcagaaa ctcgaaaact tgtcatgaat 
gcctacaaga aagttcaatt aaagctagaa caggccgata 
cctggcaact agcagagtta gaaactcagt caaataattc 
tagacgtgaa aatgattgac cctaaacttg accgattaaa 
tagttctatc actaaaattg acgccgacag cgccgatgtc 
caagtatatt cagtggcggc aggtgaatgc attaaaattg 
gatatgaagc aatcttgcat cctcgttcca gtctttttaa 
agtgattgac gaaggttaca aaggtgacac tgatgaatgg 
gatatcttct acgaccaaag aattgcccaa tttagaattc 
tcgtagaatc tttaggaaat gcggctcgtg gaggccatgg 
cagttgatga aggactggaa taaggattcg aaagctcttg 
ttccaagaat ccctttttct gcgccttcta tgaattatca 
agttgaattc ttcggtcctg agtcaagtgg gaaaactact 
atggtatttg agcaggaatg ggaacagaag actgaagaac 
ccaaagctag caagactgct gtcaaggaac ttgaaatgca 
tgtatatctt gaccttgaga atacattaga cactgagtgg 
atttggatag ttcgccctga aatgaacagc gctgaagaaa 
caggtgaagt tggcctagta gttctagatt ccttgcctta 
gttgactaaa aaggcctatg caggaatctc agcgcctttg 
cttactcgct acaatgcaat attcctaggc atcaatcaaa 
cctattcaac tccaggcgga aagatgtgga agcatgcttg 
ctaccttgac gaaaacggtg catcattgac ccgtactgct 
ttcgtcgaga agaccaaagc atttaagccg gacagaaaat_ 
gaattcaaat tgaaaatgac cttgtagatg tcgctgfccga 
gttcagtatc gtcgaccttg aaactggaga aattatgaca 
ggcaaggcaa atctagttcg acgcttcaag gaggatgact 
acgaaattat cactcgagaa gaaggctaat gcaaaaatct 
tcaaggcgca agaaaagaac ggttccaaaa cctaaaccta 
accgcagaga gcgtcaagtg cttgttcata gttgcatcta 
cgggcagtat gacaaatgga gccacgaact atattctctt 
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10081 acagtttcgc accctgatga gtttcgacag actgttctct ataacgagtt taaacagttt gacggaaata 

10151 ctggaatggg tcttccatac gactgtcagt ttgctgtaag ggtcgcagaa aggcttttaa gaaaatgaat 

10221 ttagcttcta aataccgtcc tcaaactctc gaggaagtgg tagctcaaga acatgtcaaa gaaattcttt 

10291 cgaaccaatt acaaaatggc gctatcaaac acggctatct attctgtggt ggcgctggaa ctggcaaaac 

10361 cactactgct cgaattttcg cgaaggatgt gaacaaagga cttggctctc ctattgaaat tgatgctgct 

10431 tctaataatg gggtagaaaa tgttcgaaac attattgaag attctagata caagtctatg gacagcgagt 

10501 tcaaagttta catcattgac gaggttcata tgctttcaac cggagcattt aatgcgctgt tgaaaacatt 

10571 agaagagccc tcatcgggaa ccgtgttcat tctatgtact actgaccctc aaaagactcc tgacactatt 

10641 ctcagtcgag ttcaacggtt tgactttact cgaattgata atgacgacat cgttaatcaa cttcaattta 

10711 ttatcgaaag tgaaaatgaa gaaggagctg gttatagtta tgagcgtgac gccctttcgt ttattgggaa 

10781 acttgcaaat ggaggaatgc gtgacagcat cacaaggctc gaaaaagtcc ttgattatag tcatcacgtt 

10851 gacatggaag ccgtttctaa tgcactagga gttccggact acgaaacatt cgcttcactt gttgaagcta 

10921 ttgccaacta tgacggctca aagtgtttag aaattgtaaa tgacttccac tactcaggaa aagacttgaa 

10991 attagtgact cgaaacttta cagacCtcct tttagaggtt tgtaagtatt ggctagttcg agatatttca 

11061 atcactcaac ttcctgctca ttttgaaagt aagctagagc aattctgtga ggcttttcaa tatcctactc 

11131 tattgtggat gctagaagaa atgaacgaac ttgctggagt tgttaaatgg gagcctaatg ctaaaccgat 

11201 aattgaaacc aaacttcttt tgatgagcaa ggaggagtga catgattgga cagggacttg ttaaatctac 

11271 cattccgaaa tggaaacaac ttccaaaata tataatcgtc gaaggtgaag taggttcagg acggaagacc 

11341 ttaatccgtt atattgcttc gaaattcgac gctgattcta ttgtagtagg aacgagtgta gatgacattc 

11411 gaaacatcat tcaggatgca cagaccattt tcaaggcgag aatctacgtg atagacggaa atagcctgtc 

11481 aatgtcagct cttaactcgc ttttgaagat agcggaagag ccacctttaa actgtcatat agccatgact 

11551 gttgatagca tcaataatgc tttacctacg cttgcaagta gagcaaaagt tctaaccatg ctaccttata 

11621 ctaatgaaga gaaaatgcag tttgtcaagt cctacaagaa ggtagatact tcaggaattg acgaccgagc 

11691 gattgtagac tattgcaatc ttgccagcaa tcttcaaatg cttgaagaca tattagaata tggcgcagaa 

11761 gagctatttg aaaaggttac aacattttat gacttaatat gggaggcaag tgctagcaat tcgctaaagg 

11831 ttactaattg gctcaaattt aaggaaactg atgaaggaaa aattgagcct aaacttttcc tcaactgtct 

11901 tttaaattgg tcgacagttg tcatcaggaa gcactatgta gaaatgtctt tcgaagaact tgaggcccat 

11971 gaccttttag tgagggaagc atctaggtgt ttgcgaaagg tatctaaaaa gggctcaaat gcgcgtgtct 

12041 gcgtgaacga atttatcagg agggtcaaac aagttgagtg atttagtatc atttcaaaaa gacattcgaa 

12111 ccaataatct aaagccgttc tatatcttgt acggcgaaga aattggtctt atgaatgttt atctcaatca 

12181 aatgggaaat gtagttcgag aaacttcggt ttcaacagtc tggaaaaccc tcactcaaaa agggctcgtt 

12251 tctaaccatc gaatattcgc tgttcgagat gataaggagt ttctgtctaa tgagtcgagg tggaaaaggc 

12321 ttccggatgt tagatatggg acactcgttt tgatggttac taaaattgac aagcgaagca agttgctaaa 

12391 ggcctttcct gataattgtg ttgagtttga gaaaatgact gacgcgcagt tgaaaaggca ttttgtgtct 

12461 aaatactcga ctattgatag cgacatgatt gacatggtta tccagttctg tctaaacgat tactctagaa 

12531 ttgacaatga attggacaag ctgtcgcgat tgaaaaaggt tgacgcatca gtagttgaat ccattgtcaa 

12601 gcacaagacc gaaattgaca ttttcagcct agttgatgat gtattggaat ataggccgga gcaggcaatt 

12671 atgaaagtga ctgaactttt agccaaagga gaaagtccta ttggattgct taccttgctt tatcaaaatt 

12741 ttaataacgc ttgtcttgtg ctaggagccg atgagcctaa agaagccaat ctaggcatta agcagttctt 

12811 aatcaataag attgtctata actttcaata cgagctggac tcagcctttg aaggcatggc tattttaggt 

12881 caagctatcg agggcataaa gaatggtcgc tatacagaaa gttcagtggt ctatatttct ttgtataaaa 

12951 ttttttcact tacttaacaa ataagctgaa atctgtgtat attacagtat aagcaaagga ggacagccta 

13021 tgacagaagt tgcggtaaat agcccgcaaa aggtgagagt agttatggtc gggaatattg aatttctcga 

13091 atatttaaaa aggaagtacg gaacagaaac ttccatcagt tatattatag aaaatgaaag gggtctaata 

13161 tgacagactt taaaaaacgc ttcaagaaag cagtaacaga aacaatcaat cgtgacggta tcgagaacct 

13231 tatggattgg ctcgaaaatg ataccaattt cttctcaagt ccagcaagca ctcgatacca tggaagctat 

13301 gaaggtggac ttgtcgagca ctcattaaac gtgttcaatc aactactttt cgaaatggat accatggtag 

13371 gcaaaggctg ggaagacatt tacccaatgg aaacagttgc aatcgtagca ctatttcacg acctttgcaa 

13441 agttggtcag tatcgtgaaa ctgaaaaatg gcgcaagaac agcgacggtg aatgggaaag ctatttagca 

13511 tatgaatacg accctgagca acttacaatg ggacatggtg caaaatctaa tttccttctt caacgtttca 

13581 ttcaactcac gccagttgaa gcccaagcaa ttttctggca tatgggagcc tatgatatta gtccttatgc 

13651 aaatttgaat ggatgtggag cagccttcga aactaatcca cttgcattct taatccatcg cgcagatatg 

13721 gccgcaactt atgtagtcga aaatgaaaac ttcgaacact ctcaaggtcc agttgaacaa gaggctgagg 

13791 ttgaagaagt agttgaagaa aaacctaaga gttcaactcg taagaaacct gcgcctaagg aagaaaaagt 

13861 tgaagaggct gaagaaaaac caaaagctgg aatcactcga cgtcgcaaac ctgcgccaaa agaggaagag 

13931 gtagaagagc ctaaagaaga gcctaagaaa gcatcttcta aaattcgaat gcctaaaaag actgaaaagg 

14001 tcgaagaggt agaaagcgca gacgagccga aagttgaaga agcagaggac gacaatgtgg tggtacctgc 

14071 tggatatgtt cgagatgtct actacttcta cagtgaagtc gctgacgttt actacaagaa agatgtcgac 

14141 gagcctgacg atgacagcga cattcttgta gacgaagaag agtacatgga cgcaatgtgt cctgtattag 

14211 aagaagactt cttctacgaa cttgacggca aggttcacaa attagcaaaa ggtgaacgct tgccggaaga 

14281 atacgacgaa gaaacttggg aacctatcac tgaagcagaa tacatcaagc gaacagaaaa acctaaagca 

14351 gttgcaaaac ctactcgaaa aactccagcg ccttctcgtc gccctcgccc ttaaaagaaa ggttgaaata 

14421 aaatgtgtga aaattgtcaa aacgaaacat tcaatactag aattttcaat gaagatgaaa gtggctatgt 

14491 cgacgcctca ttcacttaca aggagattcg cgacaccgca gcagctatta gcaatcgagc ggtagaaaag 

14561 aaagaccgtg acagcctttt agtcgctaca gttatggctc ttcccgtttc tcacgcagaa gatttaggca 

14631 agagactttg tattgcaaat tctcgattgg aagcatttcg tgaagctgtt caagaggctc tcgagaatga 

14701 aaaggctgaa gatttaaagg acgttatctt aggtcttatc gacgttgaca aaaaaattgg caaccttgca 

14771 ttgcaattag ttgaatcagg agcattataa tggaacgaat aaagacgcta tttcacgtga tttatgctaa 

14841 cggcactcat ttagaagtag cagctttgtt cgataccgtt gatgattatg atgacgttat agaggacatc

14911 caggggtata ttgatacccc tgacctttat aatcaaagga gcattagaat ggcgccttac aatccfegaca 

14981 tcaatggtga cgctattgct actgacattt tactacgact agatgatatt atctacgtcg acgcaacttg 

15051 tgaaactatt aaatacgagg agcctattgc atgaacaatc agcgaaagca aatgaacaaa cgaatcgccg

15121 aacttcgcga agactatcaa cgtgcaagag gtcgaataaa cttccttctt gctgtaaagg accacggcga 

15191 agaactcgaa aaccttgaag cctttgtggg atacattgac aatctagtcg aatgttttcc tgaaagccaa 

15261 cgaaatgtct tgaggctatg tgtattagat gaccttccag tcactaatgc ggccgctgaa attggatacc 

15331 actatacatg ggttcaccaa cttcgagaca aagcagttga aacacttgaa gaaattctag atggggataa 
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15401 cattattcgc tctaaacacg gaatcgaaat 

15471 agttagtgtc tttgtactgt cagccttttg 

15541 gaggaccacc gtagtaccgt cgcccttgta 

15611 tctttatcct cgcctatctg ccacgacatc 

15681 aaacaagaga aggcagttgc taagcagttg 

15751 acaaaggtga cgtcgtaaca gactcaatgc 

15821 cagcttgaaa aaggaatggt tcctaaaaaa 

15891 atcgctttcg actttggtga cggaggcgaa 

15961 tagaggatag aaatgataac cttatttaaa 

16031 ccatgcaact gtacgcagac cttattccta 

16101 tgaccctatt gttcgagaaa acgtacttga 

16171 acaaacctcg accagaatga tgtcgacgat 

16241 actacctaac caagctacaa agtcaacaaa 

16311 ttgataaatt ccagcaattt gacgagcgca 

16381 ccttgttcat ttcttgcttt aattctttcg 

16451 caattctagc atcaacttcc atgtcgcgag 

16521 tactgcaatg tcaagttcgc tctttctaat 

16591 gtgaccttat attgtttctc agtttctttt 

16661 cgtctttcca atctgctgta agataaccga 

16731 ttgacgcttg ttttatttat attatgatta 

16801 aagttgaact tttttaaata tttttttttg 

16871 aaaattaagt tcatcttcat aagcaagaat 

16941 atttcgtgga ctcctttttt aagttcgtcg 

17011 caatcctttc gagtcgcttt tcattttgtg 

17081 aatagcttga atggcctcaa aaaagtccgt

17151 caggaaagca aagcgttcca gctagtgatt 

17221 tcagaatatc tttgtagtca atatcagctt 

17291 ccaaatctcc gtcctcgtca tcgctttcat

17361 gtttcgaacc cgaatgctaa ggacttccat 

17431 tcccactcta aatcgtcgta gtcgaagata 

17501 ttgccatttt agtttcctcc ttatgcgata 

17571 tgaacttaac ttggtcgacc gtttcttcca 

17641 ttggccgttt tcgttgataa tttcgtacca 

17711 ttcattttac tacctccact ttttcgtcca 

17781 aagacgttct aggcttaccc atttacgacc 

17851 acaactttca ttcctacttg caaatcttta 

17921 tatagtatta ttatacgata atgagtgaat 

17991 tttttttttc aaaaaaataa cgagccgaag 

18061 cctcatagcc tttacgacgt gctacctttc 

18131 tactttaaag tcatccgcct tggcacagtc 

18201 ttggaaaact cacctatatt agcacaacgc 

18271 ctaaaaagtt gtccaaggtt ataggaaggt 

18341 aaggctgaca atttcactgt ccttaaatag 

18411 ggctctgctc cgctatctag tacatcgcca 

18481 gggcgtctgc acgcgcaacc tggagctcct 

18551 aattccttca aaatagctct tgtccgggtc 

18621 aggtcgaaat atacttgaat ttcatctgta 

18691 cttttacatt tacttttttc gagagatttg 

18761 ctttttgttc tttgccatgc tagtatctcc 

18831 tgcttctcgc gatgcaatag tttcgagaat 

18901 ccagttatgg tggcgtcaat taagtaacca 

18971 atactagcct tttataatag ccatttcctg 

19041 ttgtagacga taaggagttc ctggaacttc 

19111 catgagtttt gaaaatggat aactttccat 

19181 caatccataa ttgaaaaggc ttatcttctc 

19251 tgaaagcgcg attaggtcat ccaggctgtc 

19321 caaaagtaag cgacatttcc aactttctct 

19391 atagtcgcgc agaataaact tcgaatttca 

19461 tcgcattctc gccatgaaac cgcccttcaa 

19531 tccttccttc tttaaatttc gaaatgtgtc 

19601 agtgaatatt cttccacctg ctttttaaat 

19671 tcttgtagga aggttcgcga gtaggaagtc 

19741 tattttagac actaattcag cgtcttgttt 

19811 tcacgctgat taatacaaaa gcacctaaaa 

19881 atattcgacc tgcttctttc ccaacagctt 

19951 ttgagcaagt gcgatattat tctttagcat 

20021 ccactcgggt tgtcatttgc taattgaata 

20091 tagtcacttt ctatcatatt ttcgagcttt 

20161 gtccaagcgc gacaagtgtc gaaatgaaat 

20231 tacatttttc aatatctact tcaagttcga 

20301 tagagccttt tcataacttt ctgctaggta 

20371 ttaccaagat tatcaaaatc agtggcgtga 

20441 tacataaata gaagcagttt tatcttccaa 

20511 tttaaaactg tcgcttcagc tacaacatta 

20581 cgcctaagac ttcagcttgg tcattgttca

20651 tgaaaatttc attttatttt ccctttattt
taaggagaaa cttgatgaat tatatggtaa aagtcattct
catgacttgc tcaatggttt atttggttac aggtaagcaa
tttggcgctc tcgtaagctc tgcggcgttc tattcgacac
acgcgcatac aaaccaattc ccacgcgcag agctagtgct
ggaggaaaag tacagcctaa ttcaggagcc actgactact
ttatagaatg caagacagtt atgaagccac aaagttcagt
tgaacaggaa aggttcgctc aaaaactcga ctattctgct
cagtatatag caatgtctat aagtcagttc aagcgaatat
ataaacagtg aaggaacagt tactccaatt aaagggtcag 
tacaagagga cgatatacag ttcgttgata taactggact 
gctcatttca cggagccgtg taggagtttc aaaatatggt
ttcctacagc acgccaaaga agaagcgctc gactttgcta
agcaaaataa atagacctat ttctaggtct atttttatta 
atcttctagc gcagatacta ggtggcggct ttcttgttta 
ttaaggcgtt cgattcttgt agttaatttc ttgatgattt 
taagtgtgac tccagtttca gcgacaggac atgctttgaa
aactgagcct aggtctaagt acaagttagg attgattcca
acaggaatgc tttcatagtg gaaagtgtag ttcttgtgac 
aataaagtgt tgtttccata attgacctct ttctgcgtcc 
tacgataata aaggaataaa gtcaagcact ttttacaaaa 
aaaataaaaa gecctaataa tagagctttt agtttagcag 
ctgtccgtac tggtaagaaa tagctgattc aatatcegge 
atagtacagt tacaatgacc tattcttgac tgaagttcct 
tatcaattgt tttcgagtct aggtgagtga aggaacttgc 
tattgaaact cctttataag aaagctcatt ccgtgtatag 
tgaatttgag ggttaggaga gtttcgataa gctacaaaat 
cagtatgatt gttgataaat accttcattt tataaccctt 
ageaggegat aacttcaacc cactcgtcgt cctcaccttc 
gtcctcaaca tcttcgaatc cttcattagg tgeatatect 
gttacaagac gtccgtcaaa ttttactgtt tcctttactg 
tatagtttga taatttgaga ttcgatgtca ccatagttga 
tgtattcgee catgtcttcg attcttccgt cttgaatcat 
ccattcatca ccgaattgtt tgattgette tttaactgtt 
ttagtgattc gttatcatag aaccgaatac gtccatcact 
ttgaeggtea gttactttaa attcagtacc ttttgcattt 
acttttacca ttttatatga ctcctttatt tgtttttctt 
aaagtcaagt gtttttgtaa acttttttaa attttttaat 
etaegttatt tatttatctg ctcaagggct tgttgaattg 
cagctttaga gccgggtgaa aagtcccaaa cagtttcgtc 
gagcaggagc tggatagctt tttgccattt ccgccaattc 
aaaacaagtg ctctagtatg ctggctagac ataatgaact 
cctttggaaa ctcataaggc tctttgacat cgtatttgaa 
ttcaccgtct ttatacataa taccttgaac aatttcagta 
accgtgtgac aataggcttt aagaactgea aaaaaacctg 
taacagtcat ccaaggctga ggtttcttac aaacaatcct 
aatagtgcct aacattgtca gcctgttttt atttatataa 
ttaggcagee acttaacagt gacttttcta taagcgattg 
tagggataag cattttcctt ttgacattta ctttttttcg 
atttctgttg gtcttgcttt ttagctctgt tcagttcagc 
atgcctgttc ataggctcac aatattcege caaagatttg 
tctattgact ccttaccata aaatacaaaa tegtcttgge 
cgcgtgtttc aattttaact aagctcattt tcacccaaac 
gaacaggagc ctcctttttt catcgtctac ttgtttaata 
ttattttcca tagtttcacc ttattccatg tacccgtcaa 
tataaggecg tgataatttt agtccagttc ccactacatt 
tagctcgagt tcgattacaa ggttgccagt atcaatttca 
agtgcttcac gatacctatc atatgtcgee tettegtcaa 
ttttagttac cgccttccaa aatttcatcg ggcataatct 
tataegctte aagactgaag tcatgttgag gtctgtcaat 
ctgaagegea ttttttgttt getegctagg taggaccata 
cgaatggcta aggctgacaa aaagcctttg aggtatgaat 
ggtcaatacg gtaacgaaga taaagcaaag cagcctcata 
ttcgccgaag aaaattattc gacttttatt caagegcata 
ttagtcgega gaatatgacc aagttcacgt tcccaccaaa 
gagaagtctc gaactgttta ggttcatcaa attgttcaac 
caacttttga gecataagaa gggcagtttg cccctcttcg 
agatttttaa ttttttcaat aattttttcg ttattcatat 
cgaaaagtca atgtcgtcta cttcaattgt cttgtcataa 
aggctacaaa acatcttttc attatggtcg aaactttcag 
gaacgacaat agtatcaaca tttcgaagcg ataaaaaggc 
aataactcca gctgaaggct tcaatccttc agctagaatt 
taaagtttca ttagttactt ccttacatat ctagagtcac 
gtcctactca atagcttcct ettegctgag tttttcgagt 
gcaaagttcg aacegttgag aatgttttcg atatttcctg 
ctaccattag gtattcatta gtaagtgctt tagcaaagtt 
gtttttcttt atactattat tatacaataa tgattgaata 
20721 aagtaaagca ttttttataa aaaagttgaa ctttttttac aattttttga actatttaaa aattataaaa 

20791 cgggcggaaa atttaggcga caatttatac ccattttcaa cctcatttat aaacaatcta atacagaaaa 

20861 ggacttaata agtaaataaa aaagcgccct gaaaatacct acaaatccca tagtccgtaa gtaaaaacaa 

20931 aaattagggg cgacataaaa gtcgagcact atcttaatct attaccagtc tcatatacaa tcgacacaga 

21001 tttagcaggc ttttagcaaa ctttcgaaca gcatgaaaaa gcatacaatt agaggaacag attatagaaa 

21071 aagcacctcc acaaacaagt tctcaaaacg ctctcaaaaa ccgtaaaatt agtaagtttg aacttttcga 

21141 acttctaaac ttttcgaata accgagccta atttagaggt cgaaaaactc aatttctcga aaagtcgaac 

21211 ctgctcgaaa acctcaaaac acccgaaaag tcgagcatag aaaggggtcg aaaagtcgag aatgctcgaa 

21281 aaactcaacc ggttcgaaaa cctcaatcct tcgaaaagtc gaaccattcg aaaagttcaa aagttcgaaa 

21351 aactcaacca ttcgagagta ggaattaagg acataccagt tcaacctttc tagcttcaaa atcactcttt 

21421 ttctcattat aggactataa attcagtcaa ttgtaagtca cgcgcaaatt tgttacaatg taaacgataa 

21491 aatataaagg agggtcaata aatggcgaaa gctactggac caaaagttcg aagaggaaaa actcctccac 

21561 ggecaaaaga caaaaaagga atcaaagcaa atgcgcgtgt caataaagac cagttegtag agtatgacta 

21631 taaaggcatc aagatgacaa ttaaggaacg tgatgetaga atgaaattgg aatttattag aggcatgact 

21701 attcaggaaa ttgcagcccg ctatggatta aatgaaaagc gtgttggcga aataeggget cgcgataaat 

21771 gggtgaaggc taagaaagag ttcgagaatg aaaaggctct tgttactaat gatacattga ctcaaatgta 

21841 tgcagggttt aaagtctcag tcaatattaa atatcacgcc gectgggaga aactaatgaa categtcgaa 

21911 atgtgtttag ataatcccga cagatattta tttactaaag aaggaaatat tagatggggc gcattagatg 

21981 tcctttcgaa ccttatagat agagctcaaa aaggacaaga aagagcgaat ggaatgette eggaagaggt 

22051 tcgatataga ctacaaattg agegegagaa aattacattg ctccgggcca aaatgggcga ccaggaaatt 

22121 gaaggcgagg ttaaagataa cttcgtagaa gcactagata aagcagctca agccgtttgg caagaattta 

22191 gtgaegcaac aggttcctac attaaaggag tgactgataa tgacaataag cctgagaaat aaactaccta 

22261 agttcaactt cgtccctttt agcaagaaac aactccagct cctaacatgg tggacaaagg gctcaccttt 

22331 tcgaactttc gatategtea tagcagaegg ttccattcgt tcaggaaaaa cagtatcgat ggctctttca 

22401 ttttcccttt gggecatgae ggaattcaac ggacaaaact ttgccatctg tggtaagaca attcactcag 

22471 ctcgacgaaa tgctattcag cctctaaagc aaatgetcac aagtcgcggg tatgaaattc gagatgttcg 

22541 aaatgaaaat ctactcatta ttagacactt tagaaatggc gaagaaattg tcaactactt ctatatattt 

22611 ggaggaaaag atgagtcgag tcaagaccct atacaggggg taacattagc aggtatcttc tgtgatgagg 

22681 tggcactgat gectgaateg tttgtcaacc aagegacagg gcgctgttcc gtaacaggtt cgaaaatgtg 

22751 gttctcttgt aacccggcca atcctaatca ctacttcaag aagaactgga ttgacaaaca ggtcgaaaag 

22821 cgtatcttat atcttcactt tacaatggac gacaacccta gettgaegga tagcattaaa aggegctatg 

22891 agaaaatgta tgctggagtc ttcaggaaaa gatttattct eggectttgg gtaacagcag atggtctagt 

22961 ttattcaatg ttcaatgaag agcagcatgt caaaaagctc aatatagaat tcgaccgttt attegtagea 

23031 ggcgactttg gtatctataa tgcaacaacc ttcggccttt atggattctc gaaaegtcat aagcgctacc 

23101 atctaattga gtcatactac cactcagggc gcgaggcgga agagcaacta actgaggegg atgttaattc 

23171 gaatattcaa tttagttcag ttctacaaaa gactactaaa gagtaegcaa atgatttagt cgatatgata 

23241 cgaggaaagc aaatcgaata tataattctc gacccgtctg ettctgetat gattgttgaa cttcaaaagc 

23311 atccttatat agctagaaag aatatcccta tcattcctgc tcgaaatgac gtgacgcttg gcatttcatt 

23381 teaegctgaa ctcttggctg agaatagatt tacactcgac cctagcaaca cgcacgacat tgatgaatac 

23451 tatgettaca gctgggacag taaagegage caaaegggag aagatagagt cattaaagag catgaccact 

23521 gcatggatag gaacagatat gcctgtctca ctgacgctct aatcaacgat gaetteggtt tcgaaataca 

23591 aatattatcc ggaaaaggcg ctagaaacta actaaacact tttatagaaa ttagtgtata atataagtag 

23661 gaggatttta aacatggcta aaaaatcaaa agctatctca cacacagacg aactgattag tcagtcgttt 

23731 gacagcccct tggcaaagaa tcaaaagttc aagaaagagc ttcaggaagt tgaaaagtat tatcaatact 

23801 tegaeggatt tgatgtcacg gacttgaata ctgactatgg gcaaacatgg aagattgacg aagactcagt 

23871 cgactataaa cctactcgag aaattcgaaa ctatattcga caacttatca aaaagcaatc aegctttatg 

23941 atgggtaaag agecagaget tatctttagt ccagttcaag acaatcaaga tgaacaggct gagaacaagc 

24011 gtattctatt cgactctatt ttaaggaatt gtaaattctg gagcaaaagt acaaatgeat tagtcgaege 

24081 cacagtaggt aagegggtat tgatgacagt agtagcaaat gccgctcaac aaattgacgt ccagttttat 

24151 tcaatgcctc agttcaccta tacagttgac cctagaaacc cttccagctt gctttctgtt gacattgttt 

24221 atcaggacga gcgtacaaaa ggaatgagca ctgaaaaaca actttggcat cattatagat atgaaatgaa 

24291 agctggaaca agtcaatcag gaattgeaac agctttagaa gacattgaag aacaatgttg gctcacttat 

24361 gecttaaegg atggagagtc gaaccaaatc tatatgacag aaagtggcca aactactatc aaggagacag 

24431 aggctaaact tgtagaaatt gaagacaacc taggaaacaa gattgaagtt cctttaaaag ttcaagaatc 

24501 cgccccaacc ggcttgaagc aaattccttg tcgagttatt cttaatgaac cattgactaa tgacatatac 

24571 gggacaagcg atgtcaaaga ccttatcaca gtagcagata acttgaacaa aactattagt gacttacgag 

24641 attcacttcg atttaaaatg ttcgagcagc ctgttatcat tgatggctct tctaagtcaa ttcaaggaat 

24711 gaagattgcg ccaaacgctt tggtcgacct taagagtgac cctacttcct caateggegg tactggaggc 

24781 aagcaagccc aagtcacttc catttcagga aacttcaact tccttccagc ggctgaatat tatttagagg 

24851 gcgctaagaa agecatgtat gaactaatgg accagccaat gectgaaaag gtacaggagg cgccatcagg 

24921 aattgcaatg cagttcttat tctacgacct aatttctcga tgtgacggaa aatggattga gtgggatgat 

24991 gctattcaat ggctcattca aatgctggaa gaaattttag caacagtgaa tgttgacttg ggaaatattc 

25061 ctcaagatat tcaatcaagt tatcaaacac ttacgacaat gactatcgaa caccactatc caattcctag 

25131 cgatgaactt tetgetaage aacttgeget cactgaagtt caaactaatg tacgcagcca ccaatcttac 

25201 attgaagaat tcagtaagaa ggaaaaggcg gacaaggaat gggaaegcat tttggaagaa cttgctcagc 

25271 ttgacgaaat ctcagctgga geattgectg tattagcaaa cgaattaaac gaacaagagg agectcaaga 

25341 tgaaacgagt gaagaagacg aagttgatga caaagaaaaa gaacaaactg aacaaccaac cgaagaagga 

25411 gtcgacccag aegttcaagg ttaattgtga ccattgtgag cataagttcg accttacatc taaacagatt 

25481 atttcgaaac atatcgaaaa gggegtagag tggagattct tcgaatgtcc taagtgccat tateggtfeca 

25551 ccacttatgt aggaaacaag gaaattgaaa accttattcg atttagaaat acttgtcgag ctaaaatgaa 

25621 gcaggaactt caaaaaggag ctgctgctaa tcaaaacact taccattcat atcgaattca ggatgagcaa 

25691 gctgggcata aaatctcagg gcttatggcg aagctaaaga aggagataaa cattgaaaaa cgagaaaaag 

25761 aatgggtatc tatatagctg ggaaaaggct attcatgaaa ataatattcg tctaaccctt gaacaggaac 

25831 aagctgtact gaaagectte agegatgeag gaactgattt aattgcaaag attaaaaagt ctcgaaatgg 

25901 atacttgect aaaagaatct ataaagacta cgcttacgac ctgcacgctg ttcttgttca actaatgact 

25971 gaatactctc ataaggegge aatgaacgea gtagatggcc aggtagttca tattctacaa gtattagcag 
26041 aagatggaaa tgctacggct gaaaagttcg aaaaggaagt cagggctgca tctttagtat tttcacgaag 

26111 agcagccgag gcagttgtca aaggcgaaat ctataaggac ggcaaaaacc tctcgaaacg tgtttggtct 

26181 tcagccgcac gcgcaggaaa tgatgttcaa caaatagtca cacaaggcct agcaagtgga atgtctgcta 

26251 cagatatggc taaaatgctc gagaaatata tcgaccctaa ggttcgaaaa gattgggacc ttgataagat 

26321 agctgagaag ctagggaaac ctgctgctca taaatatcaa aatctcgaat acaatgccct tcgacttgct 

26391 cgaactacca ttagccattc cgccacagct ggagtgagac aatggggcaa ggttaatcct catgctcgaa 

26461 aagttcaatg gcattctgtt cacgccccag gtcgaacgtg tcaagcgtgt atcgatttag atggtgaagt 

26531 atttcctatc gaagaatgtc ctttcgacca tcctaatgga atgtgctacc aaactgtatg gtacgaaaac 

26601 tcactcgaag aaatcgctga tgagttgaga ggctgggtag acggagaacc taatgatgta ttagacgaat 

26671 ggtacgacga tttaagttca ggaaaagttg agaaatacag cgacctcgac tttgttaaaa gttattaggc 

26741 tcggttcaat accgagtctt tttgtctata aattgtctaa tttcgagaac cttcgaaaag tagtaaaatg 

26811 atattcagtt atgttataat ataagttgaa aaggaacctt gtcgccttaa tgactcgaaa ttggtttcac 

26881 tgttccaatt aaataaaaac agcagattca gccggagggc ggaaaactca ggaggaaaat aaatggctta 

26951 tcaattagaa gacttgttaa aaggtctaga tgaaccaact atcaaacagg tgaaggaaat tatttcgaaa 

27021 acttcgaaag aactcgatgc taaaattttc attgacggcg acggtcaaca ttttgtacct cacgcacgtt 

27091 tcgatgaagt tgttcaacag cgcgatgcag ctaacggctc aatcaattct tataaagaac aagtcgcgac 

27161 gctttctaaa caggtcaaag ataacggtga tgcgcagacc actatccaaa accttcaaga gcaactcgac 

27231 aagcagtctc aacttgcaaa aggcgctgtg attacttcag ctcttcatcc gttgattagt gactccattg 

27301 ctccagcagc agacattctt ggatttatga accttgacaa cattacggtc gaaagtgacg gtaaagttaa 

27371 aggtcttgat gaagagttga aagctgttcg tgagtctcgt aaacacttat tcaaagaagt cgaagttccc 

27441 gcagaacaag aggctcaagc taagtcgcca gccgggactg gaaatttagg aaatccaggt cgtgtcggtg 

27511 gtggtgttcc cgaacctcgt gaaatcggct cttttggtaa gcaacttgct gctgctcaac aaacggcagg 

27581 agcacaagaa caatcatcat tctttaaata ataggaggaa ctaactatgc ctaatgtgcg agttaagaaa 

27651 actgatttta atcaaaccac tcgaagcatt gtcgcaattc ctgaccacta cgttgctttg gctgctcaaa 

27721 ttccagctac cgcagcaact caagtaggga acaagaaata cattcttgcc ggaacttgcg tgaaaaatgc 

27791 tactacattt gaaggacgca aaactggact cgaagtagta tctaccggtg aacaattcga cggagttatc 

27861 ttcgctgacc aagaagtgtt tgaaggtgaa gaaaaagtaa ccgtgacagt attagttcac ggattcgtca 

27931 aatatgcagc ccttcgaaaa gttggcgatg ctgtgcctga atctaaaaac gcaatgattc ttgtcgttaa 

28001 ataggaggaa ttatagatga atatttatga ttatatcaac gcaggggaga ttgctagcta cattcaagca 

28071 cttccttcaa acgctcttca ataccttgga ccaactcttt tccctaatgc tcaacaaaca gggacagaca 

28141 tttcatggct caagggtgca aataatttgc cagtaactat ccagccatct aactacgacg cgaaagcaag 

28211 tcttcgtgaa cgtgctggat ttagcaaaca agctactgag atggcattct tccgtgagtc tatgcgactt 

28281 ggtgaaaaag accgtcaaaa cttgcaaatg ctattgaacc aaagttcagc tcttgcccaa ccacttatca 

28351 ctcaactcta taatgatact aagaaccttg tagacggtgt tgaagcgcaa gcagaataca tgcgtatgca 

28421 attgcttcaa tacggtaaat tcactgtcaa atcaactaac agcgaggctc aatacactta cgactacaac 

28491 atggatgcta agcaacaata tgcagtcact aagaaatgga ctaacccagc tgaaagtgac cctatcgctg 

28561 acattttagc agcaatggat gacatcgaaa atcgtacagg tgttcgccct actcgaatgg tcttgaaccg 

28631 aaacacttat aaccaaatga ctaagagtga ctctatcaag aaagctcttg caattggtgt tcaaggttct 

28701 tgggaaaact tcttgcttct tgcaagtgac gctgagaaat tcatcgctga aaaaacaggt cttcaaatcg 

28771 ctgtctactc taagaaaatt gctcagttcg ctgacgctga caaacttcct gacgttggta acattcgtca 

28841 gttcaacttg attgaegacg gtaaagtggt attgcttcca cctgacgcag ttggtcacac ttggtacggt 

28911 actactccag aagcattcga cttggcttca ggcggaacag acgctcaagt tcaagttctt tcaggcggac 

28981 ctaccgttac aacttatctt gaaaaacatc ctgtcaacat tgcaacagtt gtatcagctg ttatgattcc 

29051 atcattcgaa ggaattgact atgtaggagt tctcacaact aattaggagg tcgctatatg gctacattga 

29121 aagctcttag caccttaatc gtttccggag cagtagtgca ttcagggtcg gtattttctt gccctgaagc 

29191 gcttgcttcg tctttaattg aacgcaattt tgcgttcgag attaaggcgg ctgaagatgg agaaacggta 

29261 gaaactgttc ctcaaacaat tgaatcagtt gaagaaattg acgaagttga acaaatgcgc gaagagtatg 

29331 cggctaaaac cgttcctgag ctcgttgaat tagcaagagc taatggaatt gacatttctt caatttctcg 

29401 aaaaagcgaa tatatcgacg ctttaattaa gtacgaacta ggagagtaaa atggcagctc aaacggacat 

29471 cgaattagtc aaaatcaata tcgataacga taattctccg tcaccaatga ctgaccaaag tatctcagct 

29541 cttttagaca agcataaatc tgtcgcctat gttagttata tgatttgctt aatgaagacc cggaatgacg 

29611 tggtaaccct tggacctatc agtctaaaag gtgacgcaga ctactggaaa caaatggcgc aattctatta 

29681 tgaccaatat aagcaagaac agcttgaaac tgatgaaaag tcgaacgctg gttcgacaat cttaatgaaa 

29751 agggctgatg ggacatgagt tatgacgtga attatgttaa gaatcaagtt cgtagagcca ttgaaaccgc 

29821 tcctactaaa atcaaggtac ttcgaaactc ttgggtcagt gatggatatg gaggaaagaa aaaggataaa 

29891 gcgaatgaag tcgtagcaga cgaccttgtt tgtttagttg ataattcaac tgttcctgac cttttagcca 

29961 attctactga cgcgggaaaa atttttgccc aaaatggagt gaaaattttc attctatatg atgaaggcaa 

30031 aatcattcaa cgagccgata ctatcgaaat taaaaactca ggaagacggt acagggtagt agaaacccac 

30101 aatcttctcg agcaagacat tttgatagaa cttaaattgg aggtgaacga ctaatgtctc agcctgaatt 

30171 agtatggaag cctgaagaat ttgttagtaa ctgtgaacgg tatcgaaaca agtttcaagt cgctgtcata 

30241 acagtctgcg aagtcgctgc tactaagatg gaagaatacg caaagacgca tgctatttgg acagaccgta 

30311 cagggaatgc tcgacagaaa ctcaaaggag aagctgcttg ggtaagcgca gaccaaatca tgatagctgt 

30381 atcacatcac atggactacg ggttttggct agaactagct catggtcgaa aatacaaaat tctcgaacag 

30451 gctgtagaag acaatgtcga agaacttttt agagcgttga gaaggttatt agactaggag tgaacatgac 

30521 taaacgaacg acaatgatgg acagattgaa ggaaattctt cctacatttc agctctcgcc tgctcctatg 

30591 cttccaggag ttgaatttga cgagcaagat acagataggc cggatgacta cattgttctt cgatatagtc 

30661 atagaatgcc cagcgcaaca aatagcctag gaagttttgc ttattggaaa gttcaaatct acgtccattc 

30731 aaactcaatt attggtatcg acgaatatag cagaaaggtt cgaaacatta tcaaggacat ggcjctacgaa 

30801 gtaacctatg cagaaactgg tgactacttc gacacaatgc tttctagata ccgactagaa atcgaatata 

30871 gaattccaca aggaggaaac taataatgag taaagacatt ctttacggaa tcaagctcgt gcaaa?cgag 

30941 gagcttgacc cattgactca gttgccaaaa gtcggcggag ctaactttgt cgtagatacg gcagaaacag 

31011 cagaactcga agccgtgacc tcggagggaa ctgaagatgt gaaacgcaat gacacgcgca ttcttgctat 

31081 cgtgcgtact ccagaccttt tatacggtta tgacttaaca ttcaaggaca acacgtttga ccctgaaatc 

31151 atggccctaa ttgaaggtgg tacagtacgt caacaaggcg gaactattgc tggatacgac accccaatgc 

31221 ttgcacaagg tgcttctaat atgaaaccat ttagaatgaa catctatgtg ccaaactatg taggtgactc 

31291 aattgtcaac tacgtgaaaa tcactttgaa taactgtacc ggtaaagctc cagggctttc aatcgggaaa 
31361 gagttctacg ctcctgagtt caacatcaag 

31431 tggactatgt ggcacaactc ccagcggttc 

31501 cgccgacgca gttcgagttg aagcaggtaa 

31571 aaggctttca aaggctggaa agttgaagga 

31641 accgagacgt caaactcgta gcacaatttg 

31711 atcacagctg agcagtttaa gcaacttgca 

31781 aacctatcca tgttaaaatt cgagcagcag 

31851 gcttttaggt aaagtgacag aactgtttgg 

31921 tcaattactg accaacagaa gaaagaagcg 

31991 tggccgaact tcttcgagta ttcgcagaag 

32061 tatgacagat gagcaactta tgacaatctt 

32131 cgtacagacg aaggaaatgt ctaatgtcat 

32201 gtcgggatgc aaactgactt aggcaaatac 

32271 aggaagacaa gactcctagg tatcctggtg 

32341 actattttca gtcgctcctc tttttgtata 

32411 aaatgacttt ggatatctca aacttcacaa 

32481 accagagccc tcgaagtcct ttcaaattgg 

32551 gttacccttc ctcttatggg atttgcagcc 

32621 cccgtgttca agctattgca ggagcgacag 

32691 tggtgctaaa actgctttta gtgcaaaaga 

32761 caggtaaatg aaatcatgga cgctatgcca 

32831 ccgcgagctc cgaggccatg gctagttcac 

32901 ggctgacgta tttgctcgag cagcagctga 

32971 tacgtcgcac ccgttgctca ctctatgggc 

33041 ccgacgccgg tattaagggc tcgcaagccg 

33111 tacgaaagcg atggtcaaat caatgcagga 

33181 ccactaagag aacaaatcgc tcaactgaaa 

33251 accttgttac cttgtatggc caaaactcgt 

33321 attggataag atgaccaatg ctctcgtgaa 

33391 gacaaccttg ctagtaaaat cgagcaaatg 

33461 tccttgagcc tgcacttgct aaaatcgtgg 

33531 acctatcggt caaaagatgg ttgtcatatt 

33601 gcaggaatgg tgatgacaac tattgtcaag 

33671 gaacgatggg aaccattgca ggagttatag 

33741 cacaaaatcg gagagattta gaaactttat 

33811 gcgttggaat ggctacttcc acgactgaaa 

33881 aagagttcgg tcagtctgta gggtctaaag 

33951 ggcaggaggc tcgattggtc agttcattgg 

34021 ggaggagtca tttcaattgc tgtttcactt 

34091 cactcgggat tgctattagt ctgctagttt 

34161 agacggaatt actcaagtat tcgaaaactt 

34231 taccttccag tctttgtcga aaaaggaact 

34301 ttcctcaagt agctgaagtg atttcacaag 

34371 tcaattagtc gaagcaggaa ttaagatact 

34441 accattcaag cagctgttca aattatcact 

34511 ttcaagcagg ccttcaaatt ttgtcagctc 

34581 agcagctgtt caaattatca tgtcgcttgt 

34651 gcgatgcaga ttataatggg tctagtcaac 

34721 ttcaaattct aatggcttta atcgagggac 

34791 aaccattact tcactattag aagcaatctt 

34861 cttttatcac ttcttcaagg gttgctaaat 

34931 cggcacttct taaagcagtt atcgacttcg 

35001 attgattcaa ggtattgctt cacttctcgg 

35071 gttagcaaga ttgctagctt tgtgggacag 

35141 gtggtattgg gtcaatgatt ggttcagctg 

35211 ggttactgga ttcgctggac aaatggtaag 

35281 agttccatgg taagttctgc ggtaagtgcg 

35351 gattcttagg tattcactct ccttcacgtg 

35421 aaatggtatt ggtaacatga ttcgaactac 

35491 gctctcagcg acgtgaagat ggatattcaa 

35561 agatggctga ccaacttcct gaaactcttc 

35631 gcctcgagtg gactcgttca atacaggaag 

35701 caaggcgagc aaaccgttgt caacattgga 

35771 cgagaggatt gtataataga agtaaagaaa 

35841 aaatagatgg ctagcagaca gacgctattg 

35911 tagaatatgt aggactcact ttcgcaggat 

35981 agtattagat tctccgtcta atgctatgtc 

36051 accgaaaagc aagttaatca aaaatacagg 

36121 tttcgacact tgaagacccc ggatactatc 

36191 tgtagacgtt caagccttta aagatacttc 

36261 gagcacagcg actcaactgt tcgaaaggtt 

36331 acccaggaag acctactcga caatttagag 

36401 aattggcgaa aaaagttcag gacagtttgt 

36471 attattattc taaatcttgg aacttttgaa 

36541 ttagatacat taaacgaggc gcattcttca 

36611 agccgatgac gcagcagctt ggacctctac 
gcacgtgaag caaccaaagc aggtttgcca gttaagtcaa 
ttcgtcgcgt gacattcgac ttgaacggtg gaacaggaac 
gaagatttct ccaaaaccag ttgaccctac cttaacaggt 
gaatcaacta tttgggacct cgacaaccac atgatgcctg 
catagaaatt tagaaagaag ggtctgttat gactaatatt 
tttcaaatca tcgcacttcc aggattttca aaaggtagtg 
gtgtcatgaa cctaatcgct aacgggaaaa tccctaatac 
agaaacttcg acagtcacta aagacaatgc tagtctagca 
ctcgaccgat tgaacaaaac cgataccggt attcaagaca 
cttcaatggt agagcctact tacgctgaag tcggcgagta 
cagtgcaatg tacggtgaag tgactcaagc tgaaaccttt 
agcagtcgcc actgaatttc atattagacc tagcgaggtg 
tgcttcgacg cagcagccgt tgcttatatt agatatttgc 
acgaaaagaa aaatccagga ttgcaaatgc ttatggagtg 
tagaaaggaa attacatgga ttttgggtca attgcagcaa 
gtcaattaaa tcttgctcaa agtcaagcgc aacggctcgc 
ttctgcttta acaggattag ggaaaggact tacgactgcg 
gcctctatta aagtagggaa tgaattccaa gctcaaatgt 
cggaagagct tggtagaatg aagactcaag caatcgacct 
ggcggctcaa ggtatggaaa atctagcttc agccggtttc 
ggggtacttg acctggctgc cgtatctgga ggagatgtgg 
ttcgagcctt tggattagag gcaaaccagg cgggtcacgt 
tacgaacgca gaaactagcg acatggcaga ggcgatgaaa 
ttgagccttg aagaaacggc tgcgtctatt gggattatgg 
gaaccacgct tagaggcgct ctctcgcgta ttgccaaacc 
attaggagtt tcgttctacg acgcgaacgg aaacatgatt 
acagctactg caggactaac acaagaggaa cgaaatcgtc 
tgtcaggtat gcttgcacta ttagacgcag gtcctgagaa 
ctcggacgga gctgctaagg aaatggcaga aactatgcag 
ggaggagctt tcgagtctgt tgctattatt gttcaacaaa 
gagcaatcac aaaagttctc gaagcattcg taaatatgtc 
cgcaggaatg gttgcagccc ttggaccact gcttctaatt 
ttaagaattg ctattcagtt tttaggtcca gcatttatgg 
caatattcta tgctctggtc gccgtgttca tgatagccta 
caacagtctt gcgcctgcta ttaaagctgg gtttggagga 
gagttaggag aatggttaca gaaggcaggc gagaaggcga 
tgtcaaaact gctcgaacag tttggaataa gtatcggtca 
aaatgttctc gaaaggctag gaggcgcatt tggaaaagta 
gtaacaaaat tcggtctcgc atttctaggg attacaggac 
catttttgac agcttgggct agaacaggtg agttcaacgc 
gacaaacaca attcagtcga cggctgattt catctctcaa 
caaattttag ttaagattat tgaaggaatt gcatctgctg 
tcattgaaaa tattgtgatg acaatttcga cagttatgcc 
cgaagcgctt ataaatggtc ttgttcaatc tcttcctact 
gctttattca atggtcttgt tcaggcactt cctacgctta 
tcataaacgg actagttcaa gcgcttccgg caattattca 
tcaagcacta attgaaaact tgcctatgat aatcgaagca 
gcactgattg aaaatatagg acctatctta gaagcaggga 
ttattcaagt gcttcctgaa ctaattacag cagcgattca 
gtcgaacctt cctcaacttc tagaagccgg agttaaattg 
atgcttcctc aactaattgc aggggctttg caaatcatga 
tccctaaact tcttcaagca ggtgttcaac ttcttaaggc 
ctcactttta tcgacagccg gaaacatgct ttcatcatta 
atggtttcag gaggtgcgaa cctgattcga aacttcatta 
tctctaaaat tggcagcatg ggaacttcaa ttgtttctaa 
cgcaggggtc aaccttgttc gaggatttat caatggtatc 
gcggctaata tggctagcag tgcattaaat gccgttaagg 
tcatggagca gatgggtatc tatacgggtc aagggttcgt 
acgtgacaag gctaaagaaa tggctgaaac tgttactgaa 
gaaaatggag ttatagaaaa ggttaaatca gtttacgaaa 
cagctcctga tttcgaagat gttcgtaaag cagccggttc 
cgacaaccct aaccaacctc agtcacaatc taaaaacaat 
acaatcgtag ttcgaaacaa tgacgacgct gacaaactgt 
ccctatcagg gtttggtaac atcgtaacac cgtaaaggag 
gtcgacggaa ttgaccttgt cgacaaaggt gcaaccgtgc 
ttaaggactc aggatttaaa aaccctgaag gcacagacgg 
cgctcttact ggaagcgtga ccttaatgtt ccacggagaa 
cagttcaaac aatttattcg ctcgaagtca ctttggagaa 
gaacgggaaa atttttagga gaaaccgagc a"aggaa&ftpt~- 
ccttgtagtt aaattaggga ttcagttcaa agatgcttac 
tataagtttc aacccgcctt gggaggcgat agcttaccta 
tagaaataag aactacttct caaatcaaag gatattttcg 
tgagttcggt actaattcag tattgatgga aagtggctcg 
cttattaaaa ttagcagtgc aaatcaagcg actaacttat 
agattcctaa tggaaattca acaattacca ttgaataccg 
tcttcccgct caagttgaac tgtttctaaa tccgtcttac 
36681 tattagaaag ggaatatatg attgacaata atttacctat gagcccaact cctggcgaaa ttgttcaagt 

36751 atatgaccaa aacttcaatc taattggagc aagtgatgaa atctttagca agcattacga agacgaaatt 

36821 gtgactcgag cccgaggaaa agaaactttc acttttgaaa gtattgaaac ctcatctatc tatcaacact 

36891 taaaggttga aaacattatc cagtatggag gaagatggtt tcgaattaaa tatgctcagg acgtagaaga 

36961 tgtcaaaggg cctaccaagt ttacctgcta cgcattatgg tatgaactag cagaaggcct gcctaggaag 

37031 ttgaaacacg ttgcttcttc tgtaggcgct gtcgcgctag atattatcaa agacgcaggt gaatgggttc 

37101 gactagtttg tcctcctgac ggtgctaaca aacaagttcg aagcataaca gccgcagaaa attcaatgct 

37171 ttggcatctt cgatatcttg caaagcaata caatttagaa ttgacatttg gttatgaaga aattatcaag 

37241 caagaggtta gaattgttca aaccgttgta tttcttcagc cttatgtcga gtctaaagta gactttcctc 

37311 ttgtagttga agagaatttg aaatatgcca ctaggcagga agattctcga aacctgtgta cggcttacaa 

37381 gttgacaggc aaaaaggaag aaggcagtca agagccttta acgtttgctt ctatcaacaa tggaagtgaa 

37451 tatctcattg atgtttcgtg gtttactaca cgccacatga agcctcgata tattgctaaa tctaaaagcg 

37521 acgaacattt tagaattaaa gaaaatttga tgagtgctgc gcgtgcttat cttgacatct acagtcgccc 

37591 actaattgga tatgaggctt cagcggtcct ttataacaag gttcctgact tgcatcatac tcaactaatt 

37661 gtcgacgacc attatgatgt tatcgagtgg cgaaagatat ctgctcgaaa aattgactac gacgaccttt 

37731 caaactctac tatcattttc caagaccctc gaaaagactt gatggacttg ctaaatgagg acggcgaagg 

37801 agtcctttca ggggaaactg taaatgagtc ccaagttgtt attagatacg cagatgacat tttagggact 

37871 aattttaatg cagaatctgg gaaatacatt ggtgtcctta atactaataa gaaaccgagc gaattagttc 

37941 ctgacgactt tacatggatt cgactagaag gtcctaaagg tgacgcaggt ttaccgggag ctcctgggcg 

38011 tgatggagtc gacggtgtac ctggaaagag cggagtaggg atagcagata cagctatcac ttatgctgta 

38081 tccgtttccg gaacgcaaga gcctgaaaat ggatggagcg aacaagttcc tgaactcata aaaggtcgat 

38151 tcttgtggac taaaacattt tggagatata ctgacggctc acatgaaact ggatactccg ttgcctatat 

38221 agggcaagac ggaaattccg gaaaagacgg aatcgcaggt aaggacggag taggtatagc cgcaactgaa 

38291 gtcatgtatg caagttcgcc atctgctact gaagctccag ctggtggatg gtctacgcaa gttcctaccg 

38361 tcccaggtgg tcagtattta tggactcgaa caagatggcg ctacactgac caaactgatg aaattggata 

38431 ttcagtttca agaatgggcg agcagggtcc taaaggtgac gcaggtcgtg acggtattgc aggaaagaac 

38501 ggaatagggt tgaagtcaac ttcagtttct tatggaatta gtcccactga ttctgcgatt cctggagtat 

38571 gggcttcaca agttccttct ttaatcaaag gtcaatatct ttggactcga actatttgga cctataccga 

38641 ttcaactacc gaaacgggct atcaaaaaac ctacattcca aaagacggga atgacggtaa aaatggaatt 

38711 gctggtaagg atggggtagg aattaagtct acgaccatta cctacgcagg ctcaacctca ggaacagttg 

38781 cgcctacttc aaattggact tctgctattc caaatgttca accgggattc ttcttgtgga cgaaaactgt 

38851 ttggaactat actgatgaca ctagcgaaac aggttactca gtttccaaga taggtgaaac aggtcctaga 

38921 ggagttcaag gtcttcaagg tcctcaaggg cttcaaggaa ttcctggacc tgcaggagct gacggacgtt 

38991 cgcaatatac tcacctcgct ttctctaata gtccaaacgg tgagggattt agtcatactg acagcggacg 

39061 ageataegtc ggtcagtatc aagatttcaa tcccgtccat tcaaaagacc ctgcagccta tacatggacg 

39131 aaatggaagg ggaatgaegg agctcaaggg atacceggga agccaggcgc agaeggtaag actaattatt 

39201 tccatatagc ttaegcttea agtgcagacg gatcacgtga gttcagtttg gaagataata atcaacaata 

39271 tatgggttat tactccgatt atgagcaagc agatagcagg gatcgaacta agtatcgatg gtttgacege 

39341 ettgecaatg ttcaagtggg aggtcgaaac gagttcctta attctttatt tgaatttggt ttaaaacctc 

39411 gctattctag ttacaatcta atggacggac aagatcaaac gcaaggacag atatctgeta ctattgacga 

39481 aegtcaaegg ttcaaaggtg ctaactcttt acgacttgac tcaacatgga aeggtaaace gcagaaccaa 

39551 aaactgacct tttctttagg aggagatacg cgattaggta ctccaaccga gtggtctaat ttagaaggtc 

39621 gtatcagttt ctgggctaag gectctagga aeggagtgag ettagctgea eggcegggtt ategtagtaa 

39691 egtatttace gcaaccttaa ccgatcaatg gaagttctac gattttaaat tctttgacaa agttaattca 

39761 aattgtaccg ctgaagcaat tttccatgta ttcactcaaa gttgttcagt gtggctcaat catattaaaa 

39831 tcgaacttgg taatatctct actcctttta gtgaagcaga ggaagacctt aaatatcgaa ttgactcaaa 

39901 agecgatcaa aagctaacta accaacagtt gacggcactc aeggaaaagg ctcaactaca tgacgcagaa 

39971 ctgaaagcta aggctacaat ggagcagtta agtaacttag aaaaggctta tgaaggtaga atgaaagcta 

40041 atgaagaagc tatcaaaaaa teggaagecg acctaatctt ageggcaagt cgaattgaag ctactatcca 

40111 agaacttggc gggctacggg aactgaagaa gttegtcgae agttacatga gctcttctaa tgaaggtcta 

40181 attateggta agaacgaegg tagctctacc attaaggtat caagtgaccg aatttctatg ttctccgcag 

40251 ggaatgaagt tatgtacctt aegcaagggt tcattcacat egataaeggg atctttaccc aatccattca 

40321 agteggcega tttagaaegg aacaatactc gtttaatcca gacatgaacg tgattcggta tgtaggataa 

40391 ggagaataac atgacaaaat ttatcaactc atacggccct cttcacttga acctttacgt cgaacaagtt 

40461 agtcaggacg taacgaacaa ctcctcgcga gttagttggc gagctactgt cgaccgcgat ggagcttatc 

40531 gaacgtggac ttatggaaat attagtaacc tttccgtatg gttaaatggt tcaagtgttc atagcagtca 

40601 cccagactac gacacgtccg gcgaagaggt aacgctcgca agtggagaag tgactgttcc tcacaatagt 

40671 gaegggacaa agacaatgtc cgtttgggct tegtttgace etaataaegg cgttcacgga aatatcacta 

40741 tctctactaa ttacacttta gacagtattc caaggtctac acagatttct agttttgagg gaaatcgaaa 

40811 tctaggatct ttacataegg ttatctttaa ccgaaaagtg aactctttta cgcatcaagt ttggtaccga 

40881 gttttcggta gcgactggat agatttaggt aagaaccata etactagegt atcctttacg ccgtcactgg 

40951 acttagcaag gtacttacct aaatcaagtt ccggaacaat ggacatctgt attcgaacct ataaeggaac 

41021 tacgeaaatt ggtagtgacg tctattcaaa eggatggagg ttcaacatcc ccgattcagt acgtcctact 

41091 ttttegggea tttctttagt agacacgact teageggtte gacagatttt aacagggaac aacttcctcc 

41161 aaatcatgtc gaacattcaa gtcaacttca acaatgette cggcgcttac ggatccacta tccaagcatt 

41231 teaegctgag ctegtaggta aaaaccaagc tatcaacgaa aaeggeggea aattgggtat gatgaacttt 

41301 aatggctccg etacegtaag agcatgggtt acagacacgc gaggaaaaca ategaaegtc caagaegtat 

41371 ctatcaatgt tatagaatac tatggaccgt ctatcaattt ctccgttcaa cgtactcgtc aaaatcctgc 

41441 aattatccaa gctcttcgaa atgctaaggt cgcacctata aeggtaggag gtcaacagaa aaacatcatjg 

41511 caaattacct tctccgtggc geegttgaac actactaatt tcacagaaga tagaggttcg gcgteaggga 

41581 cgttcactac tatttcccta atgactaact cgtccgcgaa cttagctggt aactaeggge eggacaagtc 

41651 ttacatagtt aaggctaaaa tccaagacag gttcacttcg actgaattta gtgctacggt agctaccgaa 

41721 tcagtagttc ttaactatga caaggaeggt cgacttggag ttggtaaggt tgtagaacaa gggaaggcag 

41791 ggtcaattga tgcagcaggt gatatatatg ctggaggtcg acaagttcaa cagtttcagc tcactgataa 

41861 taatggagca ttgaacaggg gtcaatataa cgatgtttgg aataagcgtg aaacagagtt tacatggega 

41931 agtaacaaat acgaggacaa ccctacggga actcgaggtg aatggggact atttcaaaat ttctggttag 
42001 atagctggaa aatggttcaa tccttcatta 

42071 aaacagctgg agacctaaca agcggaaaga 

42141 aaacctgttc ttcaaagtgg gtggaaccat 

42211 acggcatagt atatttgaga ggaaacgcgc 

42281 tcctgaagga tttagaccga aagtttcaat 

42351 ctatgtatat acactgacgg aagacttgtg 

42421 atgtctcatt tcgtatttaa tttgagccga 

42491 tgttgaacct tacaaaatcg cgccaaattg 

42561 tgtcaaaaca acgattgtga acattgatgc 

42631 gacttgtatg ctgcgaaccg tcgagaactt 

42701 tcgaagatga aattctagct gaacagtcaa 

42771 tctatgccaa tgtggctaaa cgacacagca 

42841 ctgtcctact aaataagtta ttcgaatgga 

42911 aactcttagc actcttaaac agcaggtcga 

42981 gacgtcattc aagacggaac tagaaaaact 

43051 taacaggcta tacaactctc gaccatttta 

43121 cggaaatggt gaagttgaag ccttgtatga 

43191 gaaactatct aacgaacaat atgacgtagc 

43261 ctaattacag gtcttggagc gttgtatcaa 

43331 caacttttgc aggtactgtt ctaggagttt 

43401 tgaggtggaa taatgggagt cgatattgaa 

43471 cttatagcat ggactttcga gacggtcctg 

43541 ctcagccgga gcttcaagtg ctggatgggc 

43611 ggttatgaac taattagtga aaatgctccg 

43681 aaggtgctag cgcaggcgct ggaggtcata 

43751 ctacgcctac gacggaattt ccgtcaacga 

43821 tacgtctatc gcttgactaa cgcaaatgct 

43891 ctggtttctg gtacgctcga gcaaacggaa 

43961 gtcttggttc tactttgacg accaaggcta 

44031 tggtattggt tcgaccgtga cggatacatg 

44101 tcaatcgcga tggttcaatg gtaaccggtt 

44171 caacggcgac atgaaatcga atgcgtttat 

44241 cgtctggcag ataaacctca attcaccgta 

44311 agaggaggaa gctcttttct taatattgtt 

44381 gtcgtatatt actctattta cttattcgaa 

44451 gttgatatga ccctttccgc cctacataat 

44521 gcttgacaac attcactcat tatcgtataa 

44591 cattatgtca aaaattaaat tcgaaaacct 

44661 aagtttaaaa tcgtttcaat tttagcagac 

44731 aacttcacct ttcagcttca actctcgaac 

44801 agaagctgct aaacctgcta aaaaggctgc 

44871 cccaaaccta aaaaagaagt ccttgaggaa 

44941 cagttagtga gaaatctact gttcgaaaac 

45011 tcttgaaagt cgaattgttg aagcctttcc 

45081 cgctctaaga agaacttcgt tactatcgaa 

45151 ggttgacaga agaccaaaag aaacttcttg 

45221 aatttttaaa ctcgtcaagg aagaagatat 

45291 tcgctatgat tgaaatcgtt atagcacgtt 

45361 ggcaagcact gatgaagatg cagttaaaat 

45431 tcttctaata acttcgaact accttataag 

45501 ttcacatctt cggcgaactt gataaagatg 

45571 aagcaatgag cagttttcgt tcaagactac 

45641 gagcatccat gtttcctttt aggcgatgag 

45711 ttagcaggaa ggcaagtttc aaacactgtt 

45781 aaaagaagta ggtattcatt caaatgagtc 

45851 ttagtgattg acggagtttc taaacgggca 

45921 ctaacattga aactcttcgc gatgctgtgt 

45991 tggaatggtt attattgacg agattcacaa 

46061 aagctccaaa gttattacaa gatgggactt 

46131 atgtcatgaa gtggctaggg gcggaacatc 

46201 ccagttcaac caaatcactg gatatcgaaa 

46271 agaagaacga aggaagaagt tttagacctg 

46341 cgaaacagtc aaaaatctat aaggaagttt 

46411 gcctaaccct ctagccgaaa cgattcgact 

46481 gatgtcaagc cttgcaagtt cgaaagatgt 

46551 gcgtgatatt tagcaattgg gaaaaggtta 

46621 caacctggta acaggagaaa ccgcagataa 

46691 tctgttactc taggaactat aggtgcgcta 

46761 tcttagatag tccgtggaca cgcgcagaaa 

46831 aagttctgtc actatctaca cgcttgtcgc 

46901 cggaaaggag aattagcaga ttatatcgta 

46971 atatcctgct taaatagaat gaaaactatc 

47041 acggaagaaa aactgcactc gaactagctc 

47111 tcaaattcct gaaaggacgg caaccagaat 

47181 ataatagaaa ggtatataaa tgaaattcac 

47251 atgttgaacc gctctctatc tacgattaat 
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caatgtcagg aagaatgttc atcaggacag cgaacgatgg 
ggttctattt aagcaagact tcgaacagaa taattggcag 
cactcaacct atggcgacgc attctattcg aaaactcttg 
ataaaggact tatcgacaaa gaggctacta ttgcagtact 
gtatcttcag gctctcaata actcatatgg aaatgccatt 
gtgaaatcga atgtagataa ttcttggtta aatttagaca 
aatcatgtta taatattttt tagaaaggag gtgagaacta 
tggcagagtt cactattgga caaggagctg aaaagaaact 
aaacgcagta tcaaccgtct ctgaaactct tcatgaccca 
cgagctgacg agcaaaaact tcgcgaaact cgttacgcaa 
agactgaaac agctctaaca gctgaataag gaggcgtcaa 
gtcttgacga cgattattac agcgtgcagc ggagtgctta 
aatcgaataa agccaagagc gttttagagg atatctctac 
cgggattgac caaacgacag tagcaaccaa tcaccaaaat 
caacgttacc gtctttatca cgacttaaaa agggaagtga 
gagagctctc tattttattc gaaagttata agaaccttgg 
aaaatacaag aaattaccaa ttagggagga agatttagat 
aaagaacgtg gtaaccgtag tcgttccagc agcgattgca 
tttgacacta ctgctatcac aggaaccatt gcacttcttg 
ctagccgaaa ctaccaaaag gaacaagaag ctcaaaacaa 
aaaggcgttg cgtggatgca ggcccgaaag ggtcgagtat 
atagctatga ctgctcaagt tctatgtact atgctctccg 
agtcaatact gagtacatgc acgcatggct tattgaaaac 
tgggatgcta aacgaggcga catcttcatc tggggacgca 
cagggatgtt cattgacagt gataacatca ttcactgcaa 
ccacgatgag cgttggtact atgcaggtca accttactac 
caaccggctg agaagaaact tggctggcag aaagatgcta 
cttatccaaa agatgagttc gagtatatcg aagaaaacaa 
catgctcgct gagaaatggt tgaaacatac tgatggaaat 
gctacgtcat ggaaacggat tggcgagtca tggtactact 
ggattaagta ttacgataat tggtattatt gtgatgctac 
ccgttataac gacggctggt atctactatt accggacgga 
gagccggacg ggctcattac tgctaaagtt taaaatatag 
tctcttaatc ccgcaaggtt tcgaccctgc ggggttttgt 
gacttcaatt ataactaaat agtcaacatg attcatgatt 
ttgtggggcg tttatttttt ataaaaattt tttacaaaat 
tacaattata aaaataaata aagccgaaag gcgaggagga 
taaaaaaggc gatgttgtgc tacgagctaa atctcaaacg 
gaaaagaaag cagaccttga atcattagaa gacggaggtg 
gttggtacac aatggaagat gaaactgaac ctaaaaaaga 
tcctgcagtt gctcgacctg ctcgaaaagg tagagtcgtt 
gaaattcctg aagttaagga acagccggaa gaagttggtt 
ctgctcctaa aaaagaaagc gtgatggcga ttactaaggc 
tgcgtctact cgaatcgtca ctcagtctta catcgcctat 
gaaactcgaa aaggtgtttc tattggagtt cgcgcaaaag 
catctattgc tcctgcatct tacgaatggg cgattgacgg 
tgacaccgca atggaattga ttgaagcttc tcacctttct 
cgaaagctag gcgaggtcga accctattta ttgaaacatg 
ggcagaaaag atttccagct tgcccaatgt agtcgagacg 
tatctcaata atgttataga cgctctagat gaatgggagc 
ttcaagacta cattgactct cgaaaccgaa tagcttcttc 
tccattcgcg caccaggttg aatgtttcga atacgcacaa 
caaggtttag ggaaaactaa acaggcaatt gatattgcag 
taatcgtatg ttgcatatca gggctcaaat ggaattgggc 
agctcatatt ttaggaagtc gagtcactaa agatgggaaa 
gaagacttgc ttggtggcca cgacgaattc ttccttatca 
tcattaaata cttaaatgaa ctgacaaaaa gcggagaaat 
gtgtaagaac ccttcaagta agcaaggggc ttcaattcaa 
acaggaactc ctctaatgaa taacccaatc gatgtattca 
atacactgac tcagttcaaa gagcgatact gtatcgtcga 
tctagctgaa cttcgcgagc ttgtcaacga ctacatgctt 
cctgaaaaga ttcgagtcac agagtatgtc gacatgaact 
tgactaaact tgttcaagaa atagataaag tcaagctcat 
tcgacaagcg actggaaatc cttcgatttt aactactcaa 
atcgaaattg tcgaggaatg tatccagcaa ggaaagtcct 
ttgaacctct tgctaagata ctttcgaaga cagtcaaatg 
gttcaacgaa attgaagaat ttatgaatca cagaaaggct 
ggaacaggat ttactttgac gaaagcggat acggttattt 
aggaccaagc cgaagatagg tgtcatagaa tcggcgcaaa
caaaggtact gttgacgaac gtatagaaga ccttattgaa 
gatggtaagc ctatgaaatc taaaattggt aaccttttcg 
tccatattaa ggaaagacac taaaaggaag ccggacagga 
aagagattga tatgtcacct agtgagttag cagagctcct 
tttaaaactc gacaaactgc tcaacaaaga gcaatgctca 
tgaaggaaaa aattggtata aagttggaga gatatgtcaa 
gcttggtatg aagcaaaaga cttcgctgaa gaaaataaca
47321 ttcacttccc gtttgttctt cctgaaccta

47391 cgaaggcgtg aacaaaccca aacgacctag

47461 actcttgtag ggaaaactga aagggaagca 

47531 tggagaatta aatgaaactt gaagatgaaa

47601 tgctaccaaa ggcgacatgg agaaacaagt 

47671 aatgacattg aatctgctca aggtaagcac 

47741 acgaagaacg cttgaaagaa attatcgaaa 

47811 actttcaggg cttatcgaat acaagcctgt 

47881 gagattgacc aagaagcaat tcttccagca 

47951 ctaaaattta gcgatatttt tggttctgcg 

48021 caggcaaccg ctgtctgcgt taattttaga

48091 caaagaatag gcaattcagg aaagcctaaa 

48161 gttctacctt attcaagaag gacgtggcaa 

48231 tgaagcactt aacggaaaac aattcgaacc 

48301 gaatttattt tcaatattaa gtgcatcgat 

48371 gaacttattt aaacattgag tcgaacattg 

48441 aaatgttcga aaaagattga acctaagcga 

48511 ttggacgaac tcgaaggaaa aacgggttca 

48581 attttttaaa atgtggttta caaaatgacc 

48651 cggtatatat acaccaataa tcgagaaata

48721 gaaaatttag ctgatagaat atggaagaaa 

48791 agtatttcga acctcaagtg ttagtcgaac 

48861 tcgagcaaat atagtcgaag aagttcgaaa 

48931 gggaaaacta gctgggcggt tcgacttttg 

49001 tcgagaaagg aatgtttgta gtgtcagctc

49071 catgcaagaa tttctcgaac gtttcgagcg 

49141 ggaggttcct caaccaaggc ctcttatcct 

49211 tgtcgactat ttatacgact aattatactg 

49281 tcgtatatat gatacttcag tggttctaga 

49351 attgaatcat agatatagta acatcacaac 

49421 gcggtgtcct attgtgcagg agtgcataat 

49491 agaaagaaaa gtcagccgtc tacttgacag 

49561 caaggaaagt cctctctaca atgaaaaggg 

49631 agcttcaagt cttaaataaa gttctcgaag 

49701 agaatacttc acggattatt tagacgagta 

49771 ccggacgacg aaactattct cgaccatttt 

49841 accttatcga caagctaaaa gaggagcatc 

49911 ggacattcaa gtagatagta acattgcgat 

49981 tctaaattcg taggcggact agacattgct 

50051 gaaaccatga cggtgaaaga cttggaatat 

50121 acttcctggt gaggatttga ttgtcataat 

50191 atgcttgcaa ctgcttggaa gaacgggcat 

50261 ttggtgctcg tatagatact attctttcga 

50331 ccatcagttc gaaaaatatg aggaccatat 

50401 acgcccttta tgattggagg aaagaacctt 

50471 catctgtggt ggggattgac cagctttcac 

50541 ccagtacgcc aacatcacca tggacctata 

50611 gtccaagcag ggcgttcggc taaaactgaa 

50681 atggagtagg tcaaaatgct agcagagtta 

50751 atctgtcgtt aaaaaccgat atggcgaaga 

50821 acctatactc ttataggatt caaagaggaa 

50891 aagcaaaagc ctctaggtcg actgctcgtc 

50961 atgaaagtaa atggtcttca aattgaagcg 

51031 aagacgaagg aacattcatt tttagacgaa

51101 tcatgcagga gggactgaaa agcatccctc 

51171 gtgacggaag ctggaacggt tcactgtttc 

51241 atgtattagg tcgaaacgat ggagggttct 

51311 cgaagtagtt aggcaaggcg tcagccctga 

51381 aaaatcattc ctgaagagga acttgataaa 

51451 cggacgagct catcgagatg tttgatgtag 

51521 gaacctcaag ggcgaaacag tattcttcaa 

51591 gatgacccta aaacggaatt tctttatggc 

51661 ctattagtca agtattcgtg actgagtctg 

51731 agtcgctctt atgggagtag gtggaggaaa 

51801 gttctagcac ttgaccctga taacgctggg 

51871 gcaaggtcgt tagatttttg aactacccta 

51941 ggaattatta aattttaatg atctagtctt 

52011 tttaaaaaga ggtcatatca atatgaaaga

52081 tggactgacg aagaatgtat caggaacttt 

52151 gttattttgg gatgctttat tcctatgcaa 

52221 tgcattcgag actatttcaa aatgtttggc 

52291 cttacaagac tcttcaagaa tagaatagtc 

52361 attggtatgt agaagtgacg ttcgatagcg 

52431 gacagttggc tattgtgaag actacggaaa 

52501 aatacagagt atgcttatat ctcgtctgtc 

52571 gtgaaattgg agtaagcagg tctgctatta
gaacagacct tgaccatcgt ggttctcgac tctgggatga
ggacaaccta atgcgcggtg acttggcatt ctacactcga 
attcaagaag atgctaaagc atttaaacgt gaacatggat 
aacagttcat cgctgcaatt gaagaagccg gtgaattaaa 
caaaagtctt cgtgatgctc taaaagagta catgaaagaa 
ttttctgcta ccttctacac gacagagcgc tcaactatgg 
aattagttga cgaagccgag acggaagaaa tgtgtgaaaa 
catcaatacg aaacttctcg aggatatgat ttatcacggc 
gttgtcattt ctgttacaga aggcattcgt tttggaaagg 
acgtttttag ggttagcaga atccaatcac accacttgcg 
aggttaatat tataccataa ggaggagata agtggcaagg 
aatgaaattg aactaacatt caaagacaag cctaaaactc 
caggtctttc aaaagtcgag catgattatt ttcaaatagt 
taatatgaag caggtgtcat ctttctttat agttcagtat 
tataactggt tcaacttttc gagcactatg aaaaatgttc 
aactttgtcg atttttagct gaaagttttg ttaaatatga 
aaggttcata acggtctcga ctttcaaaag agcctggatt 
aaattcgaag gattttatta gtttagtaga ctatttttag 
tcaataggcg tataatttat caatcttgat tctttcgggc 
ataaattata gtatcgaaaa tataaaaagg agaaaagttg 
aagttaaatg accttttcga gagaagtggg ctacctcaaa 
gaaaagccga caaggaatgt tgggaatggc tagaagctgt 
cggtcttagc attgttattg cttcgaatac tgtcgggaat 
caacgctatt tagcagaaac tgcacttgac ggaagaattg 
aactattgac tgagttcggc gactataatt attttcaaac 
ccttaagact tgtgagctat tagtcataga cgaaataggt 
tatctgtatg acttggttaa ttatagggtt gacaataact 
acgatgaaat tattgacctt ttaggccaaa ggctttatag 
ttttcaggca agcaatgtaa gaggattgga ggtaagcgaa 
tatttttctt tggcagattg tctttctttg tatttgctgc 
gagcgagagt ctcaagataa ggtgattcaa agttataagc 
tcgatagttc aggagcttgg ctaggaagtg ctccgggagc 
acagcatgta ggaaaattga aagaggtggg agagtgatac 
aaaagagctt atccatttta gaaaataatg gaattgacca 
tcaatttatt caagaacact tttcgagata tggaagagtt 
cctggattcg aatttttcga aattggcgaa actgatgaat
tatataattc acttgttcca attttaacgg aagcggctga 
tgcgaatata attccaaaac tagaagaact tttcaatcgc 
cgaaatgcta aacttcgact agactgggcg aatactatta 
cgacagggtt tgaactattg gacgacgtgc ttggaggctt 
ggctcgacct ggacaaggta agtcgtggac tattgataaa 
gatgtccttc tatatagcgg ggaaatgagt gaaatgcaag 
atgttagcat caattcaatt accaaaggga tttggaacga 
tcaagcaatg actgaggctg aaaattccct tgtggtagtc 
acccctgcaa ttttagatag catgatatct aaatatagac 
tcatgagcga gtcttatcca agcagggagc agaagcgaat 
taagatttct gctaaatatg gaattcctat tgtgcttaat 
ggcgctgaaa gtatggaact agaacatata gcagaaagtg 
tcgctatgaa gcgtgacgaa aaatccggca tacttgaact 
ccgaaaaatc atcgaatata tgtgggacgt tgaaactgga 
ggcgaagaag gaactgaaaa aggcgaaagc tctccattga 
ttcgaagtaa ggttacaagg gaaggagttg aagcattttg 
actcctgaac aaataattga aaaactttcg agacaacttg 
ctaagtcgct tggaagcaac tatcaattct catgcccgtt 
ttgtggcatg agtagaaatc cttcttattc aggaagtaag 
acttgcggct acacttcagg actaactgaa ttcgtctcga 
atggaaacca gtggctgaaa aggaattttg gaacatctag 
agcgtttcga agaaatggga gaactgaaaa agtcgagcat 
taccggttta ttcatcctta tatgtatgaa cggaaattga 
gttatgacaa actgcatgat tgcatcacct ttccagtacg 
ccgtcgaagt gttcgttcta agtttcacca gtacggtgaa 
caatatgagc ttgtagcatt tcgagactac tttgaaaaac 
ttatcaactg cttgactctt tggtcaatga agattccagc 
tcaaatcaat ttactaaaac gacttcctta tagaaatatt 
cagacagcgc aggaaaaact ctaccgacag ttaaagcgaa 
aagagttcta tgataataag tgggatataa acgaccatcc 
gtagaaattc atttattatc gtataataaa gttagaaaat 
agcgaataga ctagtttcta gctatgtagg attcgaatgc 
gaactagacc ctgatatgtc aattgcgtct gcttatcatc
aaaggtttaa atgcttatct cgacatgaca ttgaaagcat 
aacgttcaaa tcaaaccaag gggccaagtt ttcaacttac 
ttagaatata ggtacctaaa tgcaccttcc atgaatcgaa 
tttcgacaaa tgaagaaggc gacgatttta gtatcctatc 
aattgaaatt gaagcaagtc ttgacttcat gacgctttct 
attcaaaacg gtccttcagt aagcgacgca gaaattgcgc 
gtcagtctaa gaagtcacta aaaaataaat taaaagattt
52641 tatataactg gtttacaaat cacgtgaatt

52711 aaaaacttca aaaatctttc aaccactaaa 

52781 aaaaatcagg aacatttagc ccagggtcta 

52851 aattgtcacc ctattgtatg atgacccgga 

52921 gttgacggtc gtcgacgcta tatcaattgc 

52991 attgtccatt atgccaaaac ggattccccc 

53061 gggaaaagtt gaaacatggg accgaggccg 

53131 ggaagccttg tgactcagcc ctttgaaatt 

53201 aattccttcc agagcgtccg gaagacagtg 

53271 aactctaatt ttagacctcg acgaagacca 

53341 gagcgttctt caagtcgttc aaactcacgt 

53411 aatcttcaca aggtcgaaca gctgaaagaa 

53481 aggattctaa catgagggcg cgagccctct 

53551 gactctttgg tgcaaagcct cgttctagca 

53621 gaagcctgca gttgaggtta cttacatttc 

53691 ctttcaacta ggattcttgg acacgttctt 

53761 agtatgtaga caaaatgatt gaagacggaa 

53831 tcacgatgag ctggcaggag tctgcttgta 

53901 gttagcaata tgacgaagat gcgaattaag 

53971 ggattgtaga ttcaggaatt cctgtcatct

54041 actcggcgtc aaaatgaatg agccagcgtg 

54111 tctcacagct tgaaaagtct tcactctaaa 

54181 atgacttatt taaaggaatt ccttttagtt 

54251 ccctttgcaa actttcgaac tctatgaatt 

54321 gaatataacc tggaaaaagt ctcatgggtt 

54391 acatggaagt ctacggtgtc gacttagacc 

54461 tatgaacgag gctgagcaag agtttcaaca 

54531 caaactaatt tccagagcta tcaaaaactc 

54601 gccctactca attagcaatt ctgttttatg 

54671 aggaacaggc gaaagtattg tcgagcattt 

54741 tatgcaaaat tagtttcgac ctatacaaca 

54811 ctacattcaa acagtacgga gccaagacag 

54881 ttctcgcggt gagggtgcag tagttcgaca 

54951 gactactctc aacaagaacc tcgttcattg 

55021 aacaaaacct ggacctatat tcagttatcg

55091 gttctatccc gacggaacga ctaacaagga 

55161 ggtcttatgt acggccgcgg ggctaactca 

55231 aggttattga agatttcttc accgagttcc 

55301 gcaggacttg ggatatgttc aaacagctac 

55371 tacgagttcg agtatatcga cgctagcaag 

55441 agatggacga tactgttcct gaacatatta 

55511 taagaagaag caagaaatta aagaccaggc

55581 atagctgatg ctcagcgcca atgtttgaac 

55651 caatgattaa ggtacacaat gacgctgaat 

55721 tgagttacta ggtgaggttc ctatcaagaa 

55791 gaagcagcca aggacattat tagtcttcca 

55861 aagaaattga aatctaaaat ctattcagtt 

55931 tttatttcga acctttaaat gtgaaaggaa 

56001 cctgcttata aatctaataa gcaagtacga 

56071 ttccttacct cgttgatttg ctttatgcaa 

56141 tttggataag tcaaaaagca agtgtcttta 

56211 aaatgctttg tctagcaaca tcggttctat 

56281 cttggaattg gaacggttgc atatatagat 

56351 tcttgcagtc aattgcttcg agatatttga 

56421 tgcttcgaga tatttgaaaa agtagtcagg 

56491 attcattcat tattat
tcgtgtatat tatatatgaa aggacaaact ttgaaacctt
aacttataaa ggagaatcga tatgggaaaa gtatcaattc 
acaacgagtt tttcacactc gctgaccacg gtgacagcgc 
aggcgaagac atggattact tcgtagtcca cgaagcagac 
aatgctattg gcgaagacgg ggaaacagtc catcctgata 
gtactgaaaa actatttctt caactttaca accatgatac 
ttcttatgtt caaaagattg ttacatttat caataaatat 
attcgttcag gagctaaagg tgaccaacga actacttatg 
ctactcttga agattttcca gaaaagagcg aacttcttgg 
aatgtttgac gtggctgacg gcaagttcac tcttcaagaa 
agaggagcat ctcctgcgcc tagacgaggt tccggtcgag 
ctccctcagt tagtcgaaga actcctccaa cacgaggtcg 
ttattattga ttaagaaagg gaaaataatg gcacaaaaag 
agaagaacga tgctcagtta cttgctcaac ggaaaaacag 
aggaaacgct ctaaaggacg cagttgctag agctcgtact 
gacagacttg agttaatcac tgaggaagca aaacccgagc 
taggttctat tgacgtagaa actgatggac tcgatactat 
ctcacctagt caaaaaggaa tctatgctcc tgtcaatcat 
aatcaaattt ctcctgagtt catgaagaaa atgcttcaac 
atcataattc gaaacttgac atgaaaccga tttattggcg 
ggacacatat ttagccgcaa tgcttttaaa tgaaaacgag 
tatgttagga acgaagaaaa cgcagaggtt gcaaaattta 
taattcctcc tgatgttgcc tatatgtatg cggcctatga 
tcaagaacaa tacttgactc caggaactga acaatgtgaa 
cttcataata ttgagatgcc tctaattaaa gttctcttcg 
aagataagct ggcagaaatt agagaacagt ttactgccaa 
gcttgtcagc gaatggcagc ctgaaattga agaacttcga 
gaaatggatg caagaggtcg agtgacggca agcatttcca 
atatcatggg attgaaaagt cctgaaaggg ataaacctag 
tgataacgat atctcaaaag cacttttgaa atatagaaaa 
cttgaccaac accttgcaaa gcctgacaat cgaattcaca 
ggcgtatgtc aagtgagaac cctaacttac agaatattcc 
aatctttgca gccagtgaag ggcattacat tattggtagt 
gcggaattaa gtggcgacga aagtatgcga catgcttacg 
gttcgaaact ttatggtgtt ccctatgaag agtgtttaga 
aggaaaactt cgaagaaatt ctgtcaagtc cgttctttta 
atcgctgagc agatgaatgt atctgtcaaa gaagcgaata 
ctaaagtggc agactatatc atattcgttc aacagcaggc 
cggtcgaaga agaaggcttc ctgatatgag tcttcctgaa
aacgaagatt tcgacccctc taactttgac gcagaccaac 
tcgaaaaata ttgggcccag ctagatagag cctggggatt 
aaaagccgaa ggaattctta ttaaggataa cggaggcaag 
tcagttattc aaggaacggc agccgacatg actaagtacg 
tgaaagaatt aggattccat ttaatgattc cagttcacga 
cgcaaaacgg ggagcagaaa ggttgacaga agttatgatt 
atgaaatgtg accccagtat agtagaaaga tggtatggtg 
gcatatataa ttctagtagt tattgcgaac cttgtgacaa 
ttttaattcc tccaagcagt tggtttatgg gattcacttt 
gaagccaaaa tttgcaggct ctttgatatg ggtagggtta 
aacctaccac aatcgcttgt cgtggcttca ggagttgcac 
tattcgacaa gctctcgaat aaattagact cgaagattgc 
tatagacgca accatatgga tttcattagg actgagtcct 
attccgtcag ccgtactagg ccaagttcta gttcagttta 
aaaagtagtc aggaaaattc ctgattatct tgcagtcaat 
aaaattcctg attatttttt ttacaaaaac gcttgacttt
Table 30

Predicted Dp-1 amino acid sequences

dplORF001

36698 atgattgacaataatttacctatgagtccaattcctggcgaaattgttcaagtatatgaccaaaacttcaatctaattggagca 

1 MIDNNLPMSPIPGEIVQVYDQNFNLIGA 

36782 agtgatgaaatctttagcaagcattacgaagacgaaattgtgactcgagctcgaggaaaagaaactttcacttttgaaagtatt 

29 SDEIFSKHYEDEIVTRARGKETFTFESI

36866 gaaacctcatctatctatcaacacttaaaggttgaaaacattatccagtatggaggaagatggtttcgaattaaatatgctcag 

57 ETSSIYQHLKVENIIQYGGRWFRIKYAQ 

36950 gacgtagaagatgtcaaagggcttaccaagtttacctgctacgcattatggtatgaactagcagaaggcttgcctaggaagttg 

85 DVEDVKGLTKFTCYALWYELAEGLPRKL 

37034 aaacacgttgcttcttctgtaggcgctgtcgcgctagatattatcaaagacgcaggtgaatgggtccgactagtttgtcctcct 

113 KHVASSVGAVALDIIKDAGEWVRLVCPP

37118 gacggtgctaacaaacaagttcgaagcataacagccgcagaaaattcaatgctttggcatcttcgatatcttgcaaagcaatac 

141 DGANKQVRSITAAENSMLWHLRYLAKQY

37202 aatttagaattgacatttggttatgaagaaattatcaagcaagaggttagaattgttcaaaccgttgtatttcttcagccttat 

169 NLELTFGYEEIIKQEVRIVQTVVFLQPY

37286 gtcgagtctaaagtagactttcctcttgtagttgaagagaatttgaaatatgtcactaggcaggaagattctcgaaacctgtgt 

197 VESKVDFPLVVEENLKYVTRQEDSRNLC

37370 acggcttacaagttgacaggtaaaaaggaagaaggcagtcaagagcctttaacgtttgcttctatcaacaatggaagtgaatat 

225 TAYKLTGKKEEGSQEPLTFASINNGSEY

37454 ctcattgatgtttcgtggtttactacacgccacatgaagcctcgatatattgctaaatctaaaagcgacgaacattttagaatt

253 LIDVSWFTTRHMKPRYIAKSKSDEHFRI 

37538 aaagaaaatttgatgagtgctgcgcgtgcttatcttgacatctacagtcgcccactaattggatatgaggcttcagcggtcctt 

281 KENLMSAARAYLDIYSRPLIGYEASAVL 

37622 tataacaaggttcctgacttgcatcatactcaactaattgtcgacgaccattatgatgttatcgagtggcgaaagatatctgct 

309 YNKVPDLHHTQLIVDDHYDVIEWRKISA

37706 cgaaaaattgactacgacgacctttcaaactctactatcattttccaagaccctcgaaaagacttgatggacttgctaaatgag 

337 RKIDYDDLSNSTIIFQDPRKDLMDLLNE

37790 gacggcgaaggagtcctttcaggggaaactgtaaatgagtcccaagttgttattagatacgcagatgacattttagggactaat 

365 DGEGVLSGETVNESQVVIRYADDILGTN 

37874 tttaatgcagaatctgggaaatacattggtgtccttaatactaataagaaaccgagcgaattagttcctgacgactttacatgg 

393 FNAESGKYIGVLNTNKKPSELVPDDFTW

37958 attcgactagaaggtcctaaaggtgacgcaggtttaccgggagctcctgggcgtgatggagtcgacggtgtacctggaaagagc 

421 IRLEGPKGDAGLPGAPGRDGVDGVPGKS 

38042 ggagtagggatagcagatacagctatcacttatgctgtatccgtttccggaacgcaagagcctgaaaatggatggagcgaacaa 

449 GVGIADTAITYAVSVSGTQEPENGWSEQ 

38126 gttcctgaactcataaaaggtcgattcttgtggactaaaacattttggagatatactgacggctcacatgaaactggatactcc 

477 VPELIKGRFLWTKTFWRYTDGSHETGYS 

38210 gttgcctatatagggcaagacggaaattccggaaaagacggaatcgcaggtaaggacggagtaggtatagccgcaactgaagtc 

505 VAYIGQDGNSGKDGIAGKDGVGIAATEV 

38294 atgtatgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtat 

533 MYASSPSATEAPAGGWSTQVPTVPGGQY 

38378 ttatggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcct 

561 LWTRTRWRYTDQTDEIGYSVSRMGEQGP 

38462 aaaggtgacgcaggtcgtgacggtattgcaggaaagaacggaatagggttgaagtcaacttcagtttcttatggaattagtccc 

589 KGDAGRDGIAGKNGIGLKSTSVSYGISP

38546 actgattctgcgattcctggagtatgggcttcacaagttccttctttaatcaaaggtcaatatctttggactcgaactatttgg 

617 TDSAIPGVWASQVPSLIKGQYLWTRTIW 

38630 acctataccgattcaactaccgaaacgggctatcaaaaaacctacattccaaaagacgggaatgacggtaaaaatggaattgct 

645 TYTDSTTETGYQKTYIPKDGNDGKKGIA

38714 ggtaaggatggggtaggaattaagtctacgaccattacctacgcaggctcaacctcaggaacagttgcgcctacttcaaattgg 

673 GKDGVGIKSTTITYAGSTSGTVAPTSNW

38798 acttctgctattccaaatgttcaaccgggattcttcttgtggacgaaaactgtttggaactatactgatgacactagcgaaaca 

701 TSAIPNVQPGFFLWTKTVWNYTDDTSET

38882 ggttactcagtttccaagataggtgaaacaggtcctagaggagttcaaggtcttcaaggtcctcaagggcttcaaggaattcct 

729 GYSVSKIGETGPRGVQGLQGPQGLQGIP 

38966 ggacctgcaggagctgacggacgttcgcaatatactcacctcgctttctctaatagtccaaacggtgagggatttagtcatact 

757 GPAGADGRSQYTHLAFSNSPNGEGFSHT 

39050 gacagcggacgagcatacgtcggtcagtatcaagatttcaatcccgtccattcaaaagaccctgcagcctatacatggacgaaa 

785 DSGRAYVGQYQDFNPVHSKDPAAYTWTK 

39134 tggaaggggaatgacggagctcaagggatacccgggaagccaggcgcagacggtaagactaattatttccatatagcttacgct 

813 WKGNDGAQGIPGKPGADGKTNYFHIAYA

39218 tcaagtgcagacggatcacgtgagttcagtttggaagataataatcaacaatatatgggttattactccgattatgagcaagca

841 SSADGSREFSLEDNNQQYMGYYSDYEQA

39302 gatagcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattct 

869 DSRDRTKYRWFDRLANVQVGGRNEFLNS 

39386 ttatttgaatttggtttaaaaccccgctattctagttacaatctaatggacggacaagatcaaacgcaaggacagatatctgct 

897 LFEFGLKPRYSSYNLMDGQDQTQGQISA 

39470 actattgacgaacgtcaacggttcaaaggtgctaactctttacgacttgactcaacatggaacggtaaaccgcagaaccaaaaa 

925 TIDERQRFKGANSLRLDSTWNGKPQNQK
39554 ctgaccttttccctaggaggagatacgcgactaggtaccccaaccgagtggtctaatttagaaggccgtatcagtttctgggct

953 LTFSLGGDTRLGTPTEWSNLEGRISFWA

39638 aaggcctctaggaacggagtgagcttagctgcacggccgggttatcgtagtaacgtatttaccgcaaccttaaccgatcaatgg 

981 KASRNGVSLAARPGYRSNVFTATLTDQW 

39722 aagttctacgattttaaattctctgacaaagtcaattcaaattgtaccgctgaagcaattttccatgtattcactcaaagttgt 

1009 KFYDFKFFDKVNSNCTAEAIFHVFTQSC 

39806 tcagtgtggctcaatcatattaaaatcgaacttggtaatatctctactccttttagtgaagcagaggaagaccttaaatatcga

1037 SVWLNHIKIELGNISTPFSEAEEDLKYR 

39890 attgactcaaaagccgatcaaaagctaactaaccaacagctgacggcactcacggaaaaggctcaactacacgacgcagaactg 

1065 IDSKADQKLTNQQLTALTEKAQLHDAEL

39974 aaagctaaggctacaatggagcagttaagtaacttagaaaaggcttatgaaggtagaatgaaagctaatgaagaagctatcaaa 

1093 KAKATMEQLSNLEKAYEGRMKANEEAIK

40058 aaatcggaagccgacctaatcttagcggcaagtcgaattgaagctactatccaagaacttggcgggctacgggaactgaagaag 

1121 KSEADLILAASRIEATIQELGGLRELKK

40142 ttcgtcgacagttacatgagctcttctaatgaaggtctaattatcggtaagaacgacggtagctctaccattaaggtatcaagt 

1149 FVDSYMSSSNEGLIIGKNDGSSTIKVSS 

40226 gaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatctttacc 

1177 DRISMFSAGNEVMYLTQGFIHIDNGIFT 

40310 caatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatgaacgtgattcggtatgtaggataa 40390

1205 QSIQVGRFRTEQYSFNPDMNVIRYVG*
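
Each pair of rows in Table 30 gives 84 nucleotides followed by the 28 amino acids they encode, so the amino acid rows can be checked against the nucleotide rows by codon-by-codon translation. The following is a minimal sketch of such a check. It assumes the standard genetic code; the names `translate`, `FIRST_ROW` and `LAST_ROW` are illustrative and do not appear in the specification. The two test strings are the first and last nucleotide rows of dplORF001 as listed above.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order
# (an assumption: the specification does not state which code table was used).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna: str) -> str:
    """Translate a nucleotide row into one-letter amino acid codes.

    Whitespace is ignored, a trailing partial codon is dropped, and any
    codon containing a character other than t, c, a, g is reported as 'X'.
    """
    dna = "".join(dna.lower().split())
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First and last nucleotide rows of dplORF001 (positions 36698 and 40310).
FIRST_ROW = (
    "atgattgacaataatttacctatgagtccaattcctggcgaaattgttcaagtatatgac"
    "caaaacttcaatctaattggagca"
)
LAST_ROW = (
    "caatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatgaac"
    "gtgattcggtatgtaggataa"
)

if __name__ == "__main__":
    print(translate(FIRST_ROW))
    print(translate(LAST_ROW))
```

Comparing the output of `translate` with the printed amino acid row is a quick way to spot rows in which either the nucleotides or the residues were damaged in reproduction; a mismatch only shows that the two rows disagree, not which of them is in error.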
dplORF002 

32386 atggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtcaagcg 

1 MDFGSIAAKMTLDISNFTSQLNLAQSQA

32470 caacggctcgcactagagtcttcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgcggtt 

29 QRLALESSKSFQIGSALTGLGKGLTTAV 

32554 acccttcctcttatgggatttgcagccgcctctattaaagtagggaatgaattccaagctcaaatgtcccgtgttcaagctatt 

57 TLPLMGFAAASIKVGNEFQAQMSRVQAI 

32638 gcaggagcgacagcggaagagcttggtagaatgaagactcaagcaatcgaccttggtgctaaaactgcttttagtgcaaaagag 

85 AGATAEELGRMKTQAIDLGAKTAFSAKE

32722 gcggctcaaggtatggaaaatctagcttcagccggtttccaggtaaatgaaatcatggacgctatgccaggggtacttgacctg 

113 AAQGMENLASAGFQVNEIMDAMPGVLDL 

32806 gctgccgtatctggaggagatgtggccgcgagctccgaggccatggctagttcacttcgagcctttggattagaggcaaaccag 

141 AAVSGGDVAASSEAMASSLRAFGLEANQ 

32890 gcgggtcacgtggctgacgtatttgctcgagcagcagctgatacgaacgcagaaactagcgacatggcagaggcgatgaaatac 

169 AGHVADVFARAAADTNAETSDMAEAMKY

32974 gtcgcacccgttgctcactctatgggcttgagccttgaagaaacggctgcgtctattgggattacggccgacgccggtattaag 

197 VAPVAHSMGLSLEETAASIGIMADAGIK

33058 ggctcgcaagccggaaccacgcttagaggcgctctctcgcgtattgccaaacctacgaaagcgatggtcaaatcaatgcaggaa 

225 GSQAGTTLRGALSRIAKPTKAMVKSMQE

33142 ttaggagtttcgttctacgacgcgaacggaaacatgattccactaagagaacaaatcgctcaactgaaaacagctactgcagga 

253 LGVSFYDANGNMIPLREQIAQLKTATAG

33226 ctaacacaagaggaacgaaatcgtcaccttgttaccttgtatggccaaaactcgttgtcaggtatgcttgcactattagacgca 

281 LTQEERNRHLVTLYGQNSLSGMLALLDA

33310 ggtcctgagaaattggataagatgaccaatgctctcgtgaactcggacggagctgctaaggaaatggcagaaactatgcaggac 

309 GPEKLDKMTNALVNSDGAAKEMAETMQD 

33394 aaccttgctagtaaaatcgagcaaatgggaggagctttcgagtctgttgctattattgttcaacaaatccttgagcctgcactt 

337 NLASKIEQMGGAFESVAIIVQQILEPAL

33478 gctaaaatcgtgggagcaatcacaaaagttctcgaagcattcgtaaatatgtcacctatcggtcaaaagatggttgtcatattc 

365 AKIVGAITKVLEAFVNMSPIGQKMVVIF 

33562 gcaggaatggttgcagcccttggaccactgcttctaattgcaggaatggtgatgacaactattgtcaagttaagaattgctatt 

393 AGMVAALGPLLLIAGMVMTTIVKLRIAI

33646 cagtttttaggtccagcatttatgggaacgatgggaaccattgcaggagttatagcaatattctatgctctggtcgccgtgttc 

421 QFLGPAFMGTMGTIAGVIAIFYALVAVF

33730 atgatagcctacacaaaatcggagagatttagaaactttatcaacagtcttgcgcctgctattaaagctgggtttggaggagcg 

449 MIAYTKSERFRNFINSLAPAIKAGFGGA

33814 ttggaatggctacttccacgactgaaagagttaggagaatggttacagaaggcaggcgagaaggcgaaagagttcggtcagtct 

477 LEWLLPRLKELGEWLQKAGEKAKEFGQS 

33898 gtagggtccaaagtgtcaaaactgctcgaacagtttggaataagtatcggtcaggcaggaggctcgattggtcagttcattgga 

505 VGSKVSKLLEQFGISIGQAGGSIGQFIG

33982 aatgttctcgaaaggctaggaggcgcatttggaaaagtaggaggagtcatttcaattgctgtttcacttgtaacaaaattcggt 

533 NVLERLGGAFGKVGGVISIAVSLVTKFG 

34066 ctcgcatttctagggattacaggaccacccgggattgctattagtctgttagtttcatttttgacagcttgggctagaacaggt 

561 LAFLGITGPLGIAISLLVSFLTAWARTG

34150 gagttcaacgcagacggaattactcaagtattcgaaaacttgacaaacacaattcagtcgacggctgatttcatctctcaatac 

589 EFNADGITQVFENLTNTIQSTADFISQY

34234 cttccagtctttgtcgaaaaaggaactcaaattttagttaagattattgaaggaattgcatctgctgcccctcaagtagttgaa 

617 LPVFVEKGTQILVKIIEGIASAVPQVVE

34318 gtgatttcacaagtcattgaaaacattgtgatgacaatttcgacagttatgcctcaattagccgaagcaggaattaaga.taCCC 

645 VISQVIENIVMTISTVMPQLVEAGIKIL

34402 gaagcgcttataaatggtcttgttcaatctcttcctactatcattcaagcagctgttcaaattatcactgctttattcaatggt 

673 EALINGLVQSLPTIIQAAVQIITALFNG

34486 cttgttcaggcacttcctacgcttattcaagcaggccttcaaattttgtcagctctcataaacggactagttcaagcgcttccg 

701 LVQALPTLIQAGLQILSALINGLVQALP 

34570 gcaattattcaagcagctgttcaaattatcaegtcgcttgttcaagcactaattgaaaacttgcctatgataatcgaagcagcg 

729 AIIQAAVQIIMSLVQALIENLPMIIEAA
34654 acgcagatcataatgggtctagccaacgcactgattgaaaatataggacctaccttagaagcagggattcaaattctaatggct

757 MQIIMGLVNALIENIGPILEAGIQILMA 

34738 ttaatcgagggacctattcaagtgcttcctgaactaattacagcagcgattcaaatcattacttcactatcagaagcaatcttg 

785 LIEGLIQVLPELITAAIQI ITSLLEAIL 

34822 tcgaaccttcctcaacttctagaagccggagttaaattgcttttatcacttcttcaagggttgctaaatatgcttcctcaacta 

813 SNLPQLLEAGVKLLLSLLQGLLNMLPQL 

34906 attgcaggggctttgcaaatcatgatggcacttcttaaagcagttatcgacttcgtccctaaacttcttcaagcaggtgttcaa 

841 IAGALQIMMALLKAVIDFVPKLLQAGVQ 

34990 cttcttaaggcattgattcaaggtattgcttcactcctcggctcacttttatcgacagctggaaacatgctttcatcattagtt 

869 LLKALIQGIASLLGSLLSTAGNMLSSLV 

35074 agcaagattgctagctttgtgggacagatggtttcaggaggtgcgaacctgattcgaaacttcattagtggtattgggtcaatg 

897 SKIASFVGQMVSGGANLIRNFISGIGSM

35158 attggttcagctgtctctaaaattggcagcatgggaacttcaattgtttctaaggttactggattcgctggacaaatggtaagc 

925 IGSAVSKIGSMGTSIVSKVTGFAGQMVS

35242 gcaggggtcaaccctgttcgaggatttatcaatggtatcagttccatggtaagttctgcggtaagtgcggcggctaatatggct 

953 AGVNLVRGFINGISSMVSSAVSAAANMA

35326 agcagtgcattaaatgccgttaagggattcttaggtattcactctccttcacgtgtcatggagcagatgggtatctatacgggt 

981 SSALNAVKGFLGIHSPSRVMEQMGIYTG

35410 caagggttcgtaaatggtattggtaacatgattcgaactacacgtgacaaggctaaagaaatggctgaaactgttactgaagct 

1009 QGFVNGIGNMIRTTRDKAKEMAETVTEA

35494 ctcagcgacgtgaagatggatattcaagaaaatggagttatagaaaaggttaaatcagtttacgaaaagatggctgaccaactt 

1037 LSDVKMDIQENGVIEKVKSVYEKMADQL

35578 cctgaaacccttccagctcctgatttcgaagatgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagt 

1065 PETLPAPDFEDVRKAAGSPRVDLFNTGS 

35662 gacaaccctaaccaacctcagtcacaatctaaaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtagttcga 

1093 DNPNQPQSQSKNNQGEQTVVNIGTIVVR

35746 aacaatgacgacgttgacaaactgtcgagaggattgtataatagaagtaaagaaactctatcagggtttggtaacattgtaaca 

1121 NNDDVDKLSRGLYNRSKETLSGFGNIVT 

35830 ccgtaa 35835 

1149 P * 
dplORF003 

53538 atggcacaaaaaggactctttggtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacagg 

1 MAQKGLFGAKPRSSKKNDAQLLAQRKKR

53622 aagcctgcagttgaggttacttacatttcaggaaacgctccaaaggacgcagttgctagagctcgtactctttcaactaggatt 

29 KPAVEVTYISGNALKDAVARARTLSTRI

53706 cttggacacgttcttgatagacttgagttaatcactgaggaagcaaaactcgagcagtatgtagacaaaatgattgaagacgga 

57 LGHVLDRLELITEEAKLEQYVDKMIEDG 

53790 ataggttctattgacgtagaaactgatggactcgatactattcacgatgagctggcaggagtctgcttgtactcacctagtcaa 

85 IGSIDVETDGLDTIHDELAGVCLYSPSQ 

53874 aaaggaatctatgctcctgtcaatcatgttagcaatatgacgaagatgcgaattaagaatcaaatttctcctgagttcatgaag 

113 KGIYAPVNHVSNMTKMRIKNQISPEFMK

53958 aaaatgcttcaacggactgtagattcaggaattcctgtcatctatcataattcgaaatttgacatgaaatcgatttattggcga 

141 KMLQRIVDSGIPVIYHNSKFDMKSIYWR

54042 ctcggcgtcaaaatgaatgagccagcgtgggatacatatttagccgcaatgcttttaaatgaaaacgagtctcacagcttgaaa 

169 LGVKMNEPAWDTYLAAMLLNENESHSLK 

54126 agtcttcactctaaatatgttaggaacgaagaaaacgcagaggttgcaaaatttaatgacttatttaaaggaattccttttagt 

197 SLHSKYVRNEENAEVAKFNDLFKGIPFS 

54210 ctaattcctcctgatgttgcctatatgtatgcggcctatgaccctttgcaaactttcgaactccatgaatttcaagaacaatac 

225 LIPPDVAYMYAAYDPLQTFELYEFQEQY

54294 ttgactccaggaactgaacaatgtgaagaatataacctggaaaaagtctcatgggttcttcataatattgagatgcctctaatt 

253 LTPGTEQCEEYNLEKVSWVLHNIEMPLI

54378 aaagttctcttcgacatggaagtctacggtgtcgacttagaccaagataagctggcagaaatcagagaacagtttaccgccaat 

281 KVLFDMEVYGVDLDQDKLAEIREQFTAN 

54462 atgaacgaggctgagcaagagtttcaacagcttgtcagcgaatggcagcctgaaattgaagaacttcgacaaactaatttccag 

309 MNEAEQEFQQLVSEWQPEIEELRQTNFQ 

54546 agctatcaaaaactcgaaatggatgcaagaggtcgagtgacggtaagcatttccagtcctacccaattagcaattctgttttat 

337 SYQKLEMDARGRVTVSISSPTQLAILFY

54630 gatatcatgggattgaaaagtcctgaaagggataaacctagaggaacaggcgaaagtattgtcgagcattttgataacgatatc 

365 DIMGLKSPERDKPRGTGESIVEHFDNDI 

54714 tcaaaagcacttttgaaatatagaaaatatgcaaaattagtttcgacctatacaacacttgaccaacaccttgcaaagcctgac 

393 SKALLKYRKYAKLVSTYTTLDQHLAKPD 

54798 aatcgaattcacactacattcaaacagtacggagctaagacagggcgtatgtcaagtgagaatcctaacttacagaacattcct 

421 NRIHTTFKQYGAKTGRMSSENPNLQNIP 

54882 tctcgcggtgagggtgcagtagttcgacaaatctttgcagccagtgaagggcattacattattggtagtgactactctcaacaa 

449 SRGEGAVVRQIFAASEGHYIIGSDYSQQ

54966 gaacctcgttcattggcggaattaagtggcgacgaaagtatgcgacatgcttacgaacaaaacctggacctatattcagttatc 

477 EPRSLAELSGDESMRHAYEQNLDLYSVI 

55050 ggttcgaaactttatggtgttccccatgaagagtgtttagagttctatcccgacggaacgactaacaaggaaggaaaacttcga

505 GSKLYGVPYEECLEFYPDGTTNKEGKLR

55134 agaaattctgtcaagtccgttcctttaggtcttatgtacggccgcggggctaactcaatcgctgagcagatgaatgtacctgtc 

533 RNSVKSVLLGLMYGRGANSIAEQMNVSV 

55218 aaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttcaacagcaggcg 

561 KEANKVIEDFFTEFPKVADYIIFVQQQA

55302 caggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatgagtcttcctgaatacgagttcgagtat 

589 QDLGYVQTATGRRRRLPDMSLPEYEFEY

55386 atcgacgctagcaagaacgaagacttcgacccctttaactttgacgcagaccaacagatggacgatactgttcctgaacacatt
617 IDASKNEDFDPFNFDADQQMDDTVPEHI

55470 atcgaaaaatattgggcccagctagatagagcctggggatttaagaagaagcaagaaattaaagaccaggcaaaagccgaagga 

645 IEKYWAQLDRAWGFKKKQEIKDQAKAEG 

55554 attcttattaaggacaacggaggcaagatagctgatgctcagcgccaatgtttgaactcagttattcaaggaacggcagccgac 

673 ILIKDNGGKIADAQRQCLNSVIQGTAAD 

55638 atgactaagtacgcaatgattaaggtacacaatgacgctgaattgaaagaattaggattccatttaatgattccagttcacgat 

701 MTKYAMIKVHNDAELKELGFHLMIPVHD

55722 gagttactaggtgaggttcctatcaagaacgcaaaacggggagcagaaaggttgacagaagttatgattgaagcagccaaggac 

729 ELLGEVPIKNAKRGAERLTEVMIEAAKD

55806 attattagtcttccaatgaaatgtgaccccagtatagtagaaagatggtatggtgaagaaattgaaatctaa 55877 

757 IISLPMKCDPSIVERWYGEEIEI*
dplORF004 

40401 atgacaaaatttatcaactcatacggccctcttcacttgaacctttacgtcgaacaagttagtcaggacgtaacgaacaactcc 

1 MTKFINSYGPLHLNLYVEQVSQDVTNNS 

40485 tcgcgagttagttggcgagctactgtcgaccgcgatggagcttatcgaacgtggacttatggaaatattagtaacctttccgta 

29 SRVSWRATVDRDGAYRTWTYGNISNLSV 

40569 tggttaaatggttcaagtgttcatagcagtcacccagactacgacacgtccggcgaagaggtaacgctcgcaagtggagaagtg 

57 WLNGSSVHSSHPDYDTSGEEVTLASGEV

40653 actgttcctcacaatagtgacgggacaaagacaatgtccgtttgggcttcgtttgaccctaataacggcgttcacggaaatatc 

85 TVPHNSDGTKTMSVWASFDPNNGVHGNI

40737 actatctctactaattacactttagacagtattccaaggtctacacagatttctagttttgagggaaatcgaaatctaggatct 

113 TISTNYTLDSIPRSTQISSFEGNRNLGS

40821 ttacatacggttatctttaaccgaaaagtgaactcttttacgcatcaagtttggtaccgagttttcggtagcgactggatagat 

141 LHTVIFNRKVNSFTHQVWYRVFGSDWID 

40905 ttaggtaagaaccatactactagcgtatcctttacgccgtcactggacttagcaaggtacttacctaaatcaagttccggaaca 

169 LGKNHTTSVSFTPSLDLARYLPKSSSGT

40989 atggacatctgtattcgaacctataacggaactacgcaaattggtagtgacgtctattcaaacggatggaggttcaacatcccc 

197 MDICIRTYNGTTQIGSDVYSNGWRFNIP 

41073 gattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagattttaacagggaacaacttc 

225 DSVRPTFSGISLVDTTSAVRQILTGNNF 

41157 ctccaaatcatgtcgaacattcaagccaacttcaacaatgcttccggcgcttacggatccactatccaagcatttcacgctgag 

253 LQIMSNIQVNFNNASGAYGSTIQAFHAE 

41241 ctcgtaggtaaaaaccaagctatcaacgaaaacggcggcaaattgggtatgatgaactttaatggctccgctaccgtaagagca 

281 LVGKNQAINENGGKLGMMNFKGSATVRA

41325 tgggttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaat 

309 WVTDTRGKQSNVQDVSINVIEYYGPSIN 

41409 ttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaaggtcgcacctataacggtaggaggt 

337 FSVQRTRQNPAIIQALRNAKVAPITVGG

41493 caacagaaaaacatcatgcaaattaccttctccgtggcgccgttgaacactactaatttcacagaagatagaggttcggcgtca 

365 QQKNIMQITFSVAPLNTTNFTEDRGSAS

41577 gggacgttcactactatttccctaatgactaactcgtccgcgaacttagctggtaactacgggccggacaagtcttacatagtt 

393 GTFTTISLMTNSSANLAGNYGPDKSYIV 

41661 aaggctaaaatccaagacaggttcacttcgactgaatttagtgctacggtagctaccgaatcagtagttcttaactatgacaag 

421 KAKIQDRFTSTEFSATVATESVVLNYDK

41745 gacggtcgacttggagttggtaaggttgtagaacaagggaaggcagggtcaattgatgcagcaggtgatatatatgctggaggt 

449 DGRLGVGKVVEQGKAGSIDAAGDIYAGG 

41829 cgacaagttcaacagtttcagctcactgataataatggagcattgaacaggggtcaatataacgatgtttggaataagcgtgaa 

477 RQVQQFQLTDNNGALNRGQYNDVWNKRE 

41913 acagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaacccgaggtgaatggggactatttcaaaatttctgg 

505 TEFTWRSNKYEDNPTGTRGEWGLFQNFW 

41997 ttagatagctggaaaatggttcaatccttcattacaatgtcaggaagaatgttcatcaggacagcgaacgatggaaacagctgg 

533 LDSWKMVQSFITMSGRMFIRTANDGNSW 

42081 agacctaacaagtggaaagaggttctatttaagcaagacttcgaacagaataattggcagaaacctgttcttcaaagtgggtgg 

561 RPNKWKEVLFKQDFEQNNWQKLVLQSGW 

42165 aaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatagtatatttgagaggaaatgtgcataaagga 

589 MHHSTYGDAFYSKTLDGIVYLRGNVHKG 

42249 cttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttcaggctctcaataac 

617 LIDKEATIAVLPEGFRPKVSMYLQALNN 

42333 tcatatggaaatgccattctatgtatatacactgacggaagacttgtggtgaaatcgaatgtagataattcttggttaaattta 

645 SYGNAILCIYTDGRLVVKSNVDNSWLNL 

42417 gacaatgtctcatttcgtatttaa 42440 

673 DNVSFRI * 
dplORF005

23674 atggctaaaaaatcaaaagctatctcacacacagacgaactgattagtcagtcgtttgacagccccttggcaaagaatcaaaag 

1 MAKKSKAISHTDELISQSFDSPLAKNQK

23758 ttcaagaaagagcttcaggaagttgaaaagtattatcaatacttcgacggatttgatgtcacggacttgaatactgactatggg 

29 FKKELQEVEKYYQYFDGFDVTDLNTDYG

23842 caaacatggaagattgacgaagactcagtcgactataaacctactcgagaaattcgaaactatattcgacaacttatcaaaaag 

57 QTWKIDEDSVDYKPTREIRNYIRQLIKK

23926 caatcacgctttatgatgggtaaagagccagagcttatctttagtccagttcaagacaatcaagatgaacaggctgagaacaag 

85 QSRFMMGKEPELIFSPVQDNQDEQAENK

24010 cgtattctattcgactctattttaaggaattgtaaattctggagcaaaagtacaaatgcattagtcgacgccacagtaggtaag 

113 RILFDSILRMCKFWSKSTNALVDATVGK 

24094 cgggtattgatgacagtagtagcaaatgccgctcaacaaattgacgtccagttttattcaatgcctcagttcacctatacagtt 

141 RVLMTVVANAAQQIDVQFYSMPQFTYTV

24178 gaccctagaaacccttccagcttgctttctgttgacattgtttatcaggacgagcgcacaaaaggaatgagcactgaaaaacaa 
169 DPRNPSSLLSVDIVYQDERTKGMSTEKQ

24262 ctttggcatcattatagatatgaaatgaaagctggaacaagtcaatcaggaattgcaacagctttagaagacactgaagaacaa 

197 LWHHYRYEMKAGTSQSGIATALEDIEEQ 

24346 tgttggctcacttatgccttaacggatggagagtcgaaccaaatctatatgacagaaagtggccaaactactatcaaggagaca 

225 CWLTYALTDGESNQIYMTESGQTTIKET 

24430 gaggctaaacttgtagaaattgaagacaacctaggaaacaagattgaagttcctttaaaagttcaagaatccgccccaaccggc 

253 EAKLVEIEDNLGNKIEVPLKVQESAPTG

24514 ttgaagcaaattccttgtcgagttattcttaatgaaccattgactaatgacatatacgggacaagcgatgtcaaagaccttatc 

281 LKQIPCRVILNEPLTNDIYGTSDVKDLI

24598 acagtagcagataacttgaacaaaactattagtgacttacgagattcacttcgatttaaaatgttcgagcagcctgttatcatt 

309 TVADNLNKTISDLRDSLRFKMFEQPVII

24682 gatggctcttctaagtcaattcaaggaatgaagattgcgccaaacgctttggtcgaccttaagagtgaccctacttcctcaatc 

337 DGSSKSIQGMKIAPNALVDLKSDPTSSI 

24766 ggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccagcggctgaatattatttagag 

365 GGTGGKQAQVTSISGNFNFLPAAEYYLE

24850 ggcgctaagaaagccatgtatgaactaatggaccagccaatgcctgaaaaggtacaggaggcgccatcaggaattgcaatgcag 

393 GAKKAMYELMDQPMPEKVQEAPSGIAMQ 

24934 tccttattctacgacctaatttctcgatgtgacggaaaatggattgagtgggatgatgctattcaatggctcattcaaatgctg 

421 FLFYDLISRCDGKWIEWDDAIQWLIQML

25018 gaagaaattttagcaacagtgaatgttgacttgggaaatattcctcaagatattcaatcaagttatcaaacacttacgacaatg 

449 EEILATVNVDLGNIPQDIQSSYQTLTTM

25102 actatcgaacaccactatccaattcctagcgatgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgc 

477 TIEHHYPIPSDELSAKQLALTEVQTNVR

25186 agccaccaatcttacattgaagaattcagtaagaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcag 

505 SHQSYIEEFSKKEKADKEWERILEELAQ

25270 cttgacgaaatctcagctggagcattgcctgtattagcaaacgaattaaacgaacaagaggagcctcaagatgaaacgagtgaa 

533 LDEISAGALPVLANELNEQEEPQDETSE

25354 gaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacgttcaaggttaa 25434

561 EDEVDDKEKEQTEQPTEEGVDPDVQG*

dplORF006

45296 atgattgaaatcgttatagcacgttcgaaagctaggcgaggtcgaaccctatttattgaaacatgggcaagcactgatgaagat 

1 MIEIVIARSKARRGRTLFIETWASTDED 

45380 gcagttaaaatggcagaaaagatttccagctcgcccaatgtagtcgagacgtcttctaataacttcgaactaccttataagtat 

29 AVKMAEKISSLPNVVETSSNNFELPYKY

45464 ttcaataatgttatagacgctctagatgaatgggagcttcacatcttcggcgaacttgataaagatgttcaagactacattgac 

57 FNNVIDALDEWELHIFGELDKDVQDYID

45548 tctcgaaaccgaatagcttcttcaagcaacgagcagttttcgttcaagactactccattcgcgcaccaggttgaatgtttcgaa 

85 SRNRIASSSNEQFSFKTTPFAHQVECFE

45632 tacgcacaagagcatccatgtttccttttaggcgatgagcaaggtttagggaaaactaaacaggcaattgatattgcagttagc 

113 YAQEHPCFLLGDEQGLGKTKQAIDIAVS 

45716 aggaaggcaagtttcaaacattgtttaatcgtatgttgcatatcagggctcaaatggaattgggcaaaagaagtaggtattcat 

141 RKASFKHCLIVCCISGLKWNWAKEVGIH 

45800 tcaaatgagtcagctcatattttaggaagtcgagtcactaaagatgggaaattagtgattgacggagtttctaaacgggcagaa 

169 SNESAHILGSRVTKDGKLVIDGVSKRAE 

45884 gacttgcttggtggccacgacgaattcttccttatcactaacattgaaactcttcgcgatgctgtgttcattaaatacttaaat 

197 DLLGGHDEFFLITNIETLRDAVFIKYLN 

45968 gaactgacaaaaagcggagaaattggaatggttattattgacgagattcacaagtgtaagaacccttcaagtaagcaaggggct 

225 ELTKSGEIGMVIIDEIHKCKNPSSKQGA

46052 tcaattcaaaagctccaaagttattacaagatgggacttacaggaacccctctaatgaataacccaatcgatgtattcaatgct 

253 SIQKLQSYYKMGLTGTPLMNNPIDVFNV 

46136 atgaagcggctaggggcggaacatcatacactgactcagttcaaagagcgatactgtatcgtcgaccagttcaatcaaatcact 

281 MKWLGAEHHTLTQFKERYCIVDQFNQIT 

46220 ggatatcgaaatctagctgaacttcgcgagcttgtcaacgactacatgcttagaagaacgaaggaagaagttttagacctgcct 

309 GYRNLAELRELVNDYMLRRTKEEVLDLP 

46304 gaaaagattcgagtcacagagtatgtcgacatgaactcgaaacagtcaaaaatctataaggaagttttgactaaacttgttcaa

337 EKIRVTEYVDMNSKQSKIYKEVLTKLVQ 

46388 gaaatagataaagtcaagctcatgcctaaccctctagccgaaacgattcgacttcgacaagcgactggaaatccttcgatttta 

365 EIDKVKLMPNPLAETIRLRQATGNPSIL 

46472 actactcaagatgtcaagtcttgcaagttcgaaagatgtatcgaaattgtcgaggaatgtatccagcaaggaaagccctgcgtg 

393 TTQDVKSCKFERCIEIVEECIQQGKSCV 

46556 atatttagcaattgggaaaaggttattgaacctcttgctaagatactttcgaagacagtcaaatgcaacctggtaacaggagaa 

421 IFSNWEKVIEPLAKILSKTVKCNLVTGE 

46640 accgcagataagttcaacgaaattgaagaatttatgaatcacagaaaggcttctgttattttaggaactataggtgcgctagga 

449 TADKFNEIEEFMNHRKASVILGTIGALG

46724 acaggatttactttgacgaaagcggatacggttattttcttagatagtccgtggacacgcgcagaaaaggaccaagccgaagat 

477 TGFTLTKADTVIFLDSPWTRAEKDQAED

46808 aggtgtcatagaattggcgcaaaaagttctgtcactatctacacgcttgtcgccaaaggtactgttgacgaacgtatagaagac

505 RCHRIGAKSSVTIYTLVAKGTVDERIED

46892 cttattgaacggaaaggagaattagcagattatatcgtagatggtaagcctatgaaatctaaaattggtaaccttttcgatatc 

533 LIERKGELADYIVDGKPMKSKIGNLFDI 

46976 ctgcccaaatag 46987 

561 LLK *
dplORF007 

22230 atgacaataagcctgagaaataaactacctaagttcaacttcgtcccttttagtaagaaacaaccccagctcctaacatggcgg 
1 MTISLRNKLPKFNFVPFSKKQLQLLTWW

22314 acaaagggctcaccttttcgaactctcgatatcgtcatagcagacggttccattcgttcaggaaaaacagtatcgatggctctt 

29 TKGSPFRTFDIVIADGSIRSGKTVSMAL 

22398 tcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaagacaattcactcagctcgacgaaat 

57 SFSLWAMTEFNGQNFAICGKTIHSARRN 

22482 gtcattcagcctctaaagcaaatgctcacaagtcgcgggtatgaaattcgagatgttcgaaatgaaaatctacttattattaga 

85 VIQPLKQMLTSRGYEIRDVRNENLLIIR 

22566 cactttagaaatggcgaagaaattgtcaactacttctatatatttggaggaaaagatgagtcgagtcaagaccttatacagggg 

113 HFRNGEEIVNYFYIFGGKDESSQDLIQG

22650 gtaacattagcaggtatcttctgtgatgaggtggcactgatgcctgaatcgtttgtcaaccaagcgacagggcgctgttccgta 

141 VTLAGIFCDEVALMPESFVNQATGRCSV

22734 acaggttcgaaaatgtggttctcttgtaacccggccaatcctaatcactacttcaagaagaactggattgacaaacaggtcgaa 

169 TGSKMWFSCNPANPNHYFKKNWIDKQVE

22818 aagcgtatcttatatcttcactttacaatggacgacaaccctagcttgacggatagcattaaaaggcgctatgagaaaatgtat 

197 KRILYLHFTMDDNPSLTDSIKRRYEKMY

22902 gctggagtcttcaggaaaagatttattctcggcctttgggtaacagcagatggtctagtttattcaatgttcaatgaagagcag 

225 AGVFRKRFILGLWVTADGLVYSMFNEEQ

22986 catgtcaaaaagctcaatatagaattcgaccgtttattcgtagcaggcgactttggtatctataatgcaacaaccttcggcctt 

253 HVKKLNIEFDRLFVAGDFGIYNATTFGL 

23070 tatggattctcgaaacgtcataagcgctaccatccaattgagtcatactaccactcagggcgcgaggcggaagagcaactaact 

281 YGFSKRHKRYHLIESYYHSGREAEEQLT 

23154 gaggcggatgttaattcgaatattcaatttagttcagttctacaaaagactactaaagagtacgcaaatgatttagtcgatatg 

309 EADVNSNIQFSSVLQKTTKEYANDLVDM

23238 atacgaggaaagcaaaccgaatatataattctcgacccgtctgcttctgctatgattgttgaacttcaaaagcatccttatata 

337 IRGKQIEYIILDPSASAMIVELQKHPYI 

23322 gctagaaagaatatccctatcattcctgctcgaaatgacgtgacgcttggcatttcatttcacgctgaactcttggctgagaat 

365 ARKNIPIIPARNDVTLGISFHAELLAEN

23406 agatttacactcgaccctagcaacacgcacgacattgatgaatactatgcttacagctgggacagcaaagcgagccaaacggga 

393 RFTLDPSNTHDIDEYYAYSWDSKASQTG 

23490 gaagatagagtcattaaagagcatgaccactgcatggataggaacagatatgcctgtctcactgacgctctaatcaacgatgac 

421 EDRVIKEHDHCMDRNRYACLTDALINDD

23574 ttcggtttcgaaatacaaatattatccggaaaaggcgctagaaactaa 23621 

449 FGFEIQILSGKGARN*
dplORF008

49624 gtgatacagcttcaagtcttaaataaagttctcgaagaaaagagcttatccattttagaaaataatggaattgaccaagaatac 

1 VIQLQVLNKVLEEKSLSILENNGIDQEY 

49708 ttcacggatcatttagacgagtatcaatttattcaagaacacttttcgagatatggaagagttccggacgacgaaactattctc 

29 FTDYLDEYQFIQEHFSRYGRVPDDETIL

49792 gaccattttcctggattcgaatttttcgaaattggcgaaactgatgaataccttatcgacaagctaaaagaggagcatctatat 

57 DHFPGFEFFEIGETDEYLIDKLKEEHLY

49876 aattcacttgttccaattttaacggaagcggctgaggacattcaagtagatagtaacattgcgattgcgaatataattccaaaa 

85 NSLVPILTEAAEDIQVDSNIAIANIIPK

49960 ctagaagaacttttcaatcgctctaaattcgtaggcggactagacattgctcgaaatgctaaacttcgactagactgggcgaat 

113 LEELFNRSKFVGGLDIARNAKLRLDWAN 

50044 actattagaaaccatgacggtgaaagacttggaatatcgacagggtttgaactattggacgacgtgcttggaggcttacttcct 

141 TIRNHDGERLGISTGFELLDDVLGGLLP 

50128 ggtgaggattcgattgtcataatggctcgacctggacaaggcaagtcgtggactattgataaaatgcttgcaactgcttggaag 

169 GEDLIVIMARPGQGKSWTIDKMLATAWK

50212 aacgggcatgatgcccttctatatagcggggaaacgagtgaaatgcaagttggtgctcgtatagatactattctttcgaatgtt 

197 NGHDVLLYSGEMSEMQVGARIDTILSNV 

50296 agcatcaatccaattaccaaagggatttggaacgaccatcagttcgaaaaatatgaggaccatattcaagcaatgactgaggct 

225 SINSITKGIWNDHQFEKYEDHIQAMTEA 

50380 gaaaattcccttgtggtagtcacgccctttatgattggaggaaagaaccttacccctgcaattttagatagcatgatatctaaa 

253 ENSLVVVTPFMIGGKNLTPAILDSMISK 

50464 tatagaccatctgtggtggggattgaccagctttcactcatgagcgagtcttatccaagcagggagcagaagcgaatccagtac 

281 YRPSVVGIDQLSLMSESYPSREQKRIQY 

50548 gccaacatcaccatggacctatataagatttctgctaaatatggaattcctattgtgcttaatgtccaagcagggcgttcggct

309 ANITMDLYKISAKYGIPIVLNVQAGRSA

50632 aaaactgaaggcgctgaaagtacggaactagaacatatagcagaaagtgatggagtaggtcaaaatgctagcagagttatcgct 

337 KTEGAESMELEHIAESDGVGQNASRVIA 

50716 atgaagcgtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaaccatcgaatat 

365 MKRDEKSGILELSVVKNRYGEDRKIIEY

50800 atgtgggacgttgaaactggaacctatactcttataggattcaaagaggaaggcgaagaaggaactgaaaaaggcgaaagctct 

393 MWDVETGTYTLIGFKEEGEEGTEKGESS 

50884 ccattgaaagcaaaagcctctaggtcgactgctcgtcttcgaagtaaggttacaagggaaggagttgaagcattttga 50961 

421 PLKAKASRSTARLRSKVTREGVEAF*
dplORF009

13160 atgacagactttaaaaaacgcttcaagaaagcagtaacagaaacaatcaatcgtgacggtatcgagaaccttatggattggctc 

1 MTDFKKRFKKAVTETINRDGIENLMDWL

13244 gaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaagctatgaaggtggacttgtcgagcactcatta 

29 ENDTNFFSSPASTRYHGSYEGGLVEHSL 

13328 aacgcgttcaatcaactacttttcgaaacggacaccatggtaggcaaaggctgggaagacatctacccaatggaaacagttgca

57 NVFNQLLFEMDTMVGKGWEDIYPMETVA

13412 atcgtagcactatttcacgacctttgcaaagttggtcagcatcgtgaaactgaaaaatggcgcaagaacagcgacggtgaatgg 

85 IVALFHDLCKVGQYRETEKWRKNSDGEW 

13496 gaaagctatttagcatatgaatacgaccctgagcaacttacaatgggacatggtgcaaaatctaatttccttcttcaacgtttc 
113 ESYLAYEYDPEQLTMGHGAKSNFLLQRF

13580 actcaactcacgccagttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttgaatgga 

141 IQLTPVEAQAIFWHMGAYDISPYANLHG

13664 tgtggagcagccttcgaaactaatccacttgcatccttaacccatcgcgcagatatggccgcaacttatgtagtcgaaaacgaa 

169 CGAAFETNPLAFLIHRADMAATYVVENE 

13748 aacttcgaatactctcaaggtccagttgaacaagaggctgaggttgaagaagtagttgaagaaaaacctaagagttcaactcgt

197 NFEYSQGPVEQEAEVEEVVEEKPKSSTR

13832 aagaaacctgcgcctaaggaagaaaaagttgaagaggctgaagaaaaaccaaaagctggaatcactcgacgtcgcaaacctgcg 

225 KKPAPKEEKVEEAEEKPKAGITRRRKPA

13916 ccaaaagaggaagaggtagaagagcctaaagaagagcctaagaaagcatcttctaaaattcgaatgcctaaaaagactgaaaag 

253 PKEEEVEEPKEEPKKASSKIRMPKKTEK 

14000 gtcgaagaggtagaaagcgcagacgagccgaaagttgaagaagcagaggacgacaatgtggtggtacctgctggatatgttcga 

281 VEEVESADEPKVEEAEDDNVVVPAGYVR 

14084 gacgcctactacttctacagtgaagtcgctgacgtttactacaagaaagatgtcgacgagcctgacgatgacagcgacattctt 

309 DVYYFYSEVADVYYKKDVDEPDDDSDIL 

14168 gtagacgaagaagagtacatggacgcaatgtgtcccgtattagaagaagacctcctctacgaacttgacggcaaggttcacaaa 

337 VDEEEYMDAMCPVLEEDFFYELDGKVHK 

14252 ttagcaaaaggtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcactgaagcagaatacatcaagcgaaca 

365 LAKGERLPEEYDEETWEPITEAEYIKRT 

14336 gaaaaacctaaagcagttgcaaaacctactcgaaaaactccagcgccttctcgtcgccctcgcccttaa 14404 

393 EKPKAVAKPTRKTPAPSRRPRP * 

dplORF010

8699 atgaaattggaacagttgatgaaggactggaataaggattcgaaagctcttgtagcagttcaaggacttgaacgtgaagcgctt 

1 MKLEQLMKDWNKDSKALVAVQGLEREAL

8783 ccaagaatccctttttctgcgccttctatgaattatcaaacctacggcgggctccctcgaaaaagggtagttgaattcttcggt 

29 PRIPFSAPSMNYQTYGGLPRKRVVEFFG

8867 cctgagtcaagtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaag 

57 PESSGKTTSALDIVKNAQMVFEQEWEQK 

8951 actgaagaactcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactc 

85 TEELKEKLENARASKASKTAVKELEMQL

9035 gatagtcttcaagagcctcttaagattgtatatcttgaccttgagaatacattagacactgagtgggctaaaaagattggagtc 

113 DSLQEPLKIVYLDLENTLDTEWAKKIGV 

9119 gatgttgacaatatttggatagttcgccctgaaatgaacagcgctgaagaaatacttcaatatgttttagacattttcgaaaca 

141 DVDNIWIVRPEMNSAEEILQYVLDIFET

9203 ggtgaagttggcctagtagttctagattccttgccttacatggtcagtcaaaaccttattgatgaagagttgactaaaaaggcc 

169 GEVGLVVLDSLPYMVSQNLIDEELTKKA

9287 tatgcaggaatctcagcgcctttgactgaatttagtcgaaaggttactcctcttcttactcgctacaatgcaatattcctaggc 

197 YAGISAPLTEFSRKVTPLLTRYNAIFLG

9371 atcaatcaaattcgagaagatatgaatagtcagtacaatgcctattcaactccaggcggaaagatgtggaagcatgcttgtgca 

225 INQIREDMNSQYNAYSTPGGKMWKHACA

9455 gttcgacttaaatttagaaaaggtgactaccttgacgaaaacggtgcatcattgacccgtactgctcgaaaccctgcagggaat 

253 VRLKFRKGDYLDENGASLTRTARNPAGN 

9539 gtagtagagtcattcgtcgagaagaccaaagcatttaagccggacagaaaattagtttcctatacgctttcctatcatgatgga 

281 VVESFVEKTKAFKPDRKLVSYTLSYHDG

9623 attcaaattgaaaatgaccttgtagatgtcgctgtcgaatttggagtcattcaaaaggcaggggcatggttcagtatcgtcgac 

309 IQIENDLVDVAVEFGVIQKAGAWFSIVD 

9707 cttgaaactggagaaattatgacagatgaagacgaagaaccatcgaagttccaaggcaaggcaaatctagttcgacgcttcaag 

337 LETGEIMTDEDEEPLKFQGKANLVRRFK 

9791 gaggatgactacttattcgacatggtgatgactgcggttcacgaaattatcactcgagaagaaggctaa 9859 

365 EDDYLFDMVMTAVHEIITREEG* 
dplORF011

28017 atgaatatttatgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttgga 

1 MNIYDYINAGEIASYIQALPSNALQYLG 

28101 ccaactcttttccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaactatccag 

29 PTLFPNAQQTGTDISWLKGANNLPVTIQ

28185 ccatctaactacgacgcgaaagcaagtcttcgtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgag 

57 PSNYDAKASLRERAGFSKQATEMAFFRE 

28269 tctatgcgacttggtgaaaaagaccgtcaaaacttgcaaatgctattgaaccaaagttcagctcttgcccaaccacttatcact

85 SMRLGEKDRQNLQMLLNQSSALAQPLIT 

28353 caactctataatgatactaagaaccttgtagacggtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaacacggt 

113 QLYNDTKNLVDGVEAQAEYMRMQLLQYG 

28437 aaattcactgtcaaatcaactaacagcgaggctcaatacacttacgactacaacatggatgctaagcaacaatatgcagtcact 

141 KFTVKSTNSEAQYTYDYNMDAKQQYAVT 

28521 aagaaatggactaacccagctgaaagtgaccctatcgctgacattttagcagcaatggatgacatcgaaaatcgtacaggcgtt 

169 KKWTNPAESDPIADILAAMDDIENRTGV

28605 cgccctactcgaatggtcttgaaccgaaacacttataaccaaatgactaagagtgactctatcaagaaagctcttgcaattggt 

197 RPTRMVLNRNTYNQMTKSDSIKKALAIG

28689 gttcaaggttcttgggaaaacttcttgcttcttgcaagtgacgctgagaaattcatcgctgaaaaaacaggtct^caa&tcgcc 

225 VQGSWENFLLLASDAEKFIAEKTGLQXA

28773 gtctactctaagaaaattgctcagttcgctgacgctgacaaacctcctgacgttggtaacattcgtcagttcaacttgattgac 

253 VYSKKIAQFADADKLPDVGNIRQFNLID

28857 gacggtaaagtggtattgcttccacctgacgcagttggtcacacttggtacggtactaccccagaagcattcgacttggcttca 

281 DGKVVLLPPDAVGHTWYGTTPEAFDLAS

28941 ggcggaacagacgcccaagttcaagttctttcaggcggacctaccgttacaacttatcttgaaaaacatcctgtcaacactgca 

309 GGTDAQVQVLSGGPTVTTYLEKHPVNIA 
29025 acagttgtatcagctgttatgattccatcattcgaaggaattgactatgtaggagttctcacaactaattag 29096

337 TVVSAVMIPSFEGIDYVGVLTTN*
dplORF012 

5346 atgagtattaagttcaaaaccgaagaactttcaaaaattgtttctcagctcaataagttgaagcctagcaagttgctagaaatc 

1 MSIKFKTEELSKIVSQLNKLKPSKLLEI 

5430 acaaactattggcatatttttggtgacggcgaatgcgtcatgtttacagcgtatgatggctcaaacttccttcgatgcattatc 

29 TNYWHIFGDGECVMFTAYDGSNFLRCII

5514 gacagcgatgttgaaattgacgtgattgtgaaagcagagcagtttggaaaacttgtagaaaagaccacggccgcaaccgtcaca 

57 DSDVEIDVIVKAEQFGKLVEKTTAATVT

5598 ttagttcctgaagaatcttcgctaaaagttattgggaatggtgagtacaatattgatattgttacagaagatgaagagtaccct 

85 LVPEESSLKVIGNGEYNIDIVTEDEEYP 

5682 acattcgaccacttgctcgaagacgtgagtgaagaaaatgctctcactttgaaaagctcgctgttctacggaatcgccaatatc 

113 TFDHLLEDVSEENALTLKSSLFYGIAMI

5766 aacgattctgcggtatctaaatcaggagcagatggaatttataccggcttcctgttaaaaggcggaaaagcaattactacagac 

141 NDSAVSKSGADGIYTGFLLKGGKAITTD 

5850 atcattcgcgtatgtatcaaccctatcaaggaaaagggactagaaatgctcattccttacaacctaatgagtattttagcaagt 

169 IIRVCINPIKEKGLEMLIPYKLMSILAS

5934 attcctgatgagaagatgtacttctggcaaattgacgatactactgtctatatttcatcggcttcagtcgaaatttatggaaaa 

197 IPDEKMYFWQIDDTTVYISSASVEIYGK 

6018 ttgatggaaggtatggaagattatgaagacgtttcacagcttgactcaattgagtttgaagatgatgcggctatccctacagca 

225 LMEGMEDYEDVSQLDSIEFEDDAAIPTA 

6102 gaaatcctgagcgtattagaccgccttgtactattcacttcagcctttgacaaaggaaccgtcgaattcttattcttgaaagac 

253 EILSVLDRLVLFTSAFDKGTVEFLFLKD 

6186 cgacttcgaattaaaacttctactagcagttatgaagacatcatgtacgcatctgctggcaagaaagtttcgaagaaagaattc 

281 RLRIKTSTSSYEDIMYASAGKKVSKKEF 

6270 acttgccaccttaacagcttactcttgaaggaaattgtatcaaccgtcaccgaagaaaacttcactgtctcttatggaagcgaa

309 TCHLNSLLLKEIVSTVTEENFTVSYGSE

6354 accgcaattaagatttcatcgaatggtgtcgtttacttcctagcacttcaagagccggaagaataa 6419 

337 TAIKISSNGVVYFLALQEPEE *
dplORF013

10215 atgaatttagcttctaaataccgtcctcaaactttcgaggaagtggtagctcaagaatatgtcaaagaaattcttttgaatcaa 

1 MNLASKYRPQTFEEVVAQEYVKEILLNQ 

10299 ttacaaaatggcgctatcaaacacggctatctattctgtggtggcgctggaactggtaaaaccactactgctcgaattttcgcg 

29 LQNGAIKHGYLFCGGAGTGKTTTARIFA

10383 aaggatgtgaacaaaggacttggctctcctattgaaattgatgctgcttctaataatggggtagaaaatgttcgaaacattatt 

57 KDVNKGLGSPIEIDAASNNGVENVRNII 

10467 gaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgctttcaaccggagcattt 

85 EDSRYKSMDSEFKVYIIDEVHMLSTGAF

10551 aatgcgctgttgaaaacattagaagagccctcatcgggaaccgtgttcattctatgtactactgaccctcaaaagattcctgac 

113 NALLKTLEEPSSGTVFILCTTDPQKIPD

10635 actattctcagtcgagttcaacggtttgactttactcgaattgataatgacgacatcgttaatcaacttcaatttattatcgaa 

141 TILSRVQRFDFTRIDMDDIVNQLQFIIE 

10719 agtgaaaatgaagaaggagctggttatagttatgagcgtgacgccctttcgtttattgggaaacttgcaaatggaggaatgcgt 

169 SENEEGAGYSYERDALSFIGKLANGGMR 

10803 gacagtatcacaaggctcgaaaaagtccttgattatagtcatcacgttgacatggaagccgtttctaatgcactaggagttccg 

197 DSITRLEKVLDYSHHVDMEAVSNALGVP

10887 gactacgaaacattcgcttcacttgttgaagctattgccaactatgacggctcaaagtgtttagaaattgtaaatgacttccac 

225 DYETFASLVEAIANYDGSKCLEIVNDFH

10971 tactcaggaaaagacttgaaattagtgactcgaaactttacagacttccttttagaggtttgtaagtattggctagttcgagat 

253 YSGKDLKLVTRNFTDFLLEVCKYWLVRD 

11055 atttcaatcactcaacttcctgctcattttgaaagtaagctagagcaattctgtgaggcttttcaatatcctactctattgtgg 

281 ISITQLPAHFESKLEQFCEAFQYPTLLW 

11139 atgctagaagaaatgaatgaacttgctggagttgttaaatgggagcctaatgctaaaccgataattgaaaccaaacttcttttg 

309 MLEEMNELAGVVKWEPNAKPIIETKLLL

11223 atgagcaaggaggagtga 11240 

337 MSKEE *
dplORF014 

50961 atgaaagtaaatggtcttcaaattgaagcgactcctgaacaaataattgaaaaactttcgagacaacttgaagacgaaggaaca 

1 MKVNGLQIEATPEQIIEKLSRQLEDEGT

51045 ttcatttttagacgaactaagtcgcttggaagcaactatcaattctcatgcccgtttcatgcaggagggactgaaaagcatccc 

29 FIFRRTKSLGSNYQFSCPFHAGGTEKHP 

51129 tcttgtggcatgagtagaaatccttcttattcaggaagtaaggtgacggaagctggaacggttcactgtttcacttgcggctac 

57 SCGMSRMPSYSGSKVTEAGTVHCFTCGY 

51213 acttcaggactaactgaattcgtctcgaatgtattaggtcgaaacgatggagggttctatggaaaccagtggctgaaaaggaat 

85 TSGLTEFVSNVLGRNDGGFYGNQWLKRN 

51297 tttggaacatctagcgaagtagttaggcaaggcgtcagccctgaagcgtttcgaagaaatgggagaactgaaaaagtcgagcat 

113 FGTSSEVVRQGVSPEAFRRNGRTEKVEH 

51381 aaaatcattcctgaagaggaacttgataaataccggtttattcatccttatatgtatgaacggaaattgacggacgagctcatc

141 KIIPEEELDKYRFIHPYMYERKLTDELI

51465 gagatgtttgatgtaggttatgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaacagtattcttc 

169 EMFDVGYDKLHDCITFPVRNLKGETVFF 

51549 aaccgtcgaagtgttcgttctaagtttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagctt 

197 NRRSVRSKFHQYGEDDPKTEFLYGQYEL

51633 gtagcatttcgagactattttgaaaaacctattagtcaagtattcgtgactgagtctgttatcaactgcttgactctttggtca 

225 VAFRDYFEKPISQVFVTESVIMCLTLWS 

51717 atgaagattccagcagtcgctcttatgggagtaggtggaggaaatcaaatcaatttactaaaacgacttccttatagaaatatt 
253 MKIPAVALMGVGGGNQINLLKRLPYRNI

51801 gttctagcacttgaccctgataacgctgggcagacagcgcaggaaaaactctaccgacagttaaagcgaagcaaggtcgttaga 

281 VLALDPDNAGQTAQEKLYRQLKRSKVVR 

51885 tttttgaactaccctaaagagttctatgataataagtgggatataaacgaccatccggaattattaaattttaatgatttagtc 

309 FLNYPKEFYDNKWDINDHPELLNFNDLV 

51969 ttgtag 51974 

337 L * 
dplORF015 

3793 atgggatttaatctatacttcgcaggaggtcacgctattagcactgacgattatttgaaggaaagaggagccaatcgcctattc 

1 MGFNLYFAGGHAISTDDYLKERGANRLF 

3877 aatcaactgtacgaaagaaacgggattggcaaaaggtggattgagcataagaaaaccaatccaagcactacttcaaaactattc 

29 NQLYERNGIGKRWIEHKKTNPSTTSKLF 

3961 gtcgactctagtgcatattctgctcataccaaaggggctgaagttgacattgacgcctatatcgaatacgtgaatgataacgtg 

57 VDSSAYSAHTKGAEVDIDAYIEYVNDNV

4045 ggaatgtttgactgtatcgccgaactcgataaaattcctggtgtatttagacagcctaagacacgtgaacagcttttggaagca 

85 GMFDCIAELDKIPGVFRQPKTREQLLEA

4129 ccacaaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagctcttacctattttccatatggga 

113 PQISWDNYLYMRERMVEKDKLLPIFHMG

4213 gaagactttaaatggctcaacttgatgctcgaaactacattcgaaggcggaaagcatattccttacattggaatttcaccagcc 

141 EDFKWLNLMLETTFEGGKHIPYIGISPA

4297 aatgactcgactacgaagcacaaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaag 

169 NDSTTKHKDKWMERVFEVIRNSSNPDVK 

4381 actcacgcatttgggatgacagttactagccaattagagcgtcacccattctatagcgccgactctacttctgtactgctcaca 

197 THAFGMTVTSQLERHPFYSADSTSVLLT

4465 ggagcgatgggaaacattatgacgtcaaaaggattagttgacttgtcacagaagaatggaggaattgatgccgtccgtaggctg 

225 GAMGNIMTSKGLVDLSQKNGGIDAVRRL 

4549 ccaaaaccggttcaagttgaaattgaatccattatcgaagaaactggagcgcattttagcccagagcaattagttgaggactat 

253 PKPVQVEIESIIEETGAHFSLEQLVEDY

4633 aaacttcgagcattgttcaatgttcaatacatgctgaattgggcagagaactatgaattcaagggaattaaaaatcgtcaacgt 

281 KLRALFNVQYMLKWAENYEFKGIKNRQR

4717 cgactattttag 4728 

309 RLF *
dplORF016

43413 atgggagtcgatattgaaaaaggcgttgcgtggatgcaggcccgaaagggtcgagtatcttatagcatggactttcgagacggt 

1 MGVDIEKGVAWMQARKGRVSYSMDFRDG

43497 cctgatagctatgactgctcaagttctatgtactatgctctccgctcagccggagcttcaagtgctggatgggcagtcaatact 

29 PDSYDCSSSMYYALRSAGASSAGWAVNT

43581 gagtacatgcacgcatggcttattgaaaacggttatgaactaattagtgaaaatgctccgtgggatgctaaacgaggcgacatc 

57 EYMHAWLIENGYELISENAPWDAKRGDI

43665 ttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcatacagggatgttcattgacagtgataacatcattcactgc 

85 FIWGRKGASAGAGGHTGMFIDSDNIIHC

43749 aactacgcctacgacggaatttccgtcaacgaccacgatgagcgttggtactatgcaggtcaaccttactactacgtctatcgc 

113 NYAYDGISVNDHDERWYYAGQPYYYVYR 

43833 ttgactaacgcaaatgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaac 

141 LTNANAQPAEKKLGWQKDATGFWYARAN

43917 ggaacttatccaaaagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggctacatgctcgct 

169 GTYPKDEFEYIEENKSWFYFDDQGYMLA 

44001 gagaaatggttgaaacatactgatggaaactggtattggttcgaccgtgacggatacatggctacgtcatggaaacggattggc 

197 EKWLKHTDGNWYWFDRDGYMATSWKRIG 

44085 gagtcatggtactacttcaatcgcgatggttcaatggtaaccggttggattaagtattacgataattggtattattgtgatgct 

225 ESWYYFNRDGSMVTGWIKYYDNWYYCDA

44169 accaacggcgacatgaaaccgaatgcgtttatccgttataacgacggctggtatctactattaccggacggacgtctggcagat 

253 TNGDMKSNAFIRYNDGWYLLLPDGRLAD 

44253 aaacctcaattcaccgtagagccggacgggctcattactgctaaagtttaa 44303 

281 KPQFTVEPDGLITAKV* 
dplORF017 

11242 atgattggacagggacttgttaaatctaccatttcgaaatggaaacaacttccaaaatatataatcgtcgaaggtgaagtaggt 

1 MIGQGLVKSTISKWKQLPKYIIVEGEVG

11326 tcaggacggaagaccttaatccgttatattgcttcgaaatttgacgctgattctattgtagtaggaacgagtgtagatgacatt 

29 SGRKTLIRYIASKFDADSIVVGTSVDDI 

11410 cgaaacatcattcaggatgcacagactattttcaaggcgagaatctacgtgatagacggaaatagcctgtcaatgtcagctctt 

57 RNIIQDAQTIFKARIYVIDGNSLSMSAL

11494 aactcgcttttgaagatagcggaagagccacctttaaactgtcatatagccatgactgttgatagcatcaataatgctttacct 

85 NSLLKIAEEPPLNCHIAMTVDSINNALP

11578 acgcttgcaagtagagcaaaagttctaaccatgctaccttatactaatgaagagaaaatgcagtttgtcaagtcctacaagaag 

113 TLASRAKVLTMLPYTNEEKMQFVKSYKK 

11662 gtagatacttcaggaattgacgaccgagcgactgtagactattgcaatcttgccagcaatcttcaaatgcttgaagacatatta 

141 VDTSGIDDRAIVDYCNLASNLQMLEDIL

11746 gaatatggcgcagaagagctatttgaaaaggttacaacatttcatgacttaatatgggaggcaagtgctag§aattcgctaaag 

169 EYGAEELFEKVTTFYDLIWEASASNSLK 

11830 gttactaattggcccaaatttaaggaaactgatgaaggaaaaattgagcctaaacttttcctcaactgtcttttaaattggtcg 

197 VTNWLKFKETDEGKIEPKLFLNCLLNWS

11914 acagttgtcatcaggaagcactatgtagaaatgtctttcgaagaacttgaggcccatgaccttttagtgagggaagcacctagg 

225 TVVIRKHYVEMSFEELEAHDLLVREASR 






11998 tgtttgcgaaaggtacctaaaaagggctcaaatgcgcgtgtctgcgtgaacgaatttatcaggagggtcaaacaagttgagtga 12081

253 CLRKVSKKGSNARVCVNEFIRRVKQVE*
dplORF018

35847 atggctagcagacagacgctattggtcgacggaattgaccttgtcgacaaaggtgcaaccgtgctagaatatgtaggactcact 

1 MASRQTLLVDGIDLVDKGATVLEYVGLT 

35931 ttcgcaggatttaaggactcaggatttaaaaaccctgaaggcatagacggagtattagattctccgtctaatgctatgtccgct 

29 FAGFKDSGFKNPEGIDGVLDSPSNAMSA

36015 cttactggaagcgtgaccttaatgttccacggagaaaccgaaaagcaagttaatcaaaaatacaggcagttcaaacaatttatt 

57 LTGSVTLMFHGETEKQVNQKYRQFKQFI 

36099 cgctcgaagtcattctggagaatttcgacacttgaagaccctggatactatcgaacgggaaaacttttaggagaaaccgagcaa 

85 RSKSFWRISTLEDPGYYRTGKFLGETEQ

36183 ggaaaacttgtagacgttcaagcctttaaagatacttcccttgtagttaaattagggattcagttcaaagatgcttacgagtac 

113 GKLVDVQAFKDTSLVVKLGIQFKDAYEY

36267 agcgactcaactgttcgaaaggcttacaagtctcaacccgctttgggaggcgatagcttacctaacccaggaagacctactcga 

141 SDSTVRKVYKFQPALGGDSLPNPGRPTR

36351 caatttagagtagaaataagaactacttctcaaatcaaaggatattttcgaattggcgaaaaaagttcaggacagtttgttgag 

169 QFRVEIRTTSQIKGYFRIGEKSSGQFVE

36435 ttcggtactaattcagtattgatggaaagtggctcgatcattattctaaatcttggaacttttgaacttattaaaattagcagt 

197 FGTNSVLMESGSIIILNLGTFELIKISS

36519 gcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatggaaattcaacaattaccatt 

225 ANQATNLFRYIKRGAFFKIPNGNSTITI

36603 gaataccgagccgatgacgcagcagcttggacctctactcttcccgctcaagttgaactgtttctaaatccgtcttactattag 36686

253 EYRADDAAAWTSTLPAQVELFLNPSYY* 
dplORF019 

12161 atgaatgtttatctcaatcaaatgggaaatgtagttcgagaaacttcggtttcaacagtctggaaaaccctcactcaaaaaggg 

1 MNVYLNQMGNVVRETSVSTVWKTLTQKG 

12245 ctcgtttctaatcatcgaatattcgctgttcgagatgataaggagtttctgtctaatgagtcgaggtggaaaaggcttccggat 

29 LVSNHRIFAVRDDKEFLSNESRWKRLPD

12329 gttagatatgggacacttgttttgatggttactaaaattgacaagcgaagcaagttgctaaaggcctttcctgataattgtgtt 

57 VRYGTLVLMVTKIDKRSKLLKAFPDNCV

12413 gagtttgagaaaatgactgacgcgcagttgaaaaggcattttgtgtctaaatactcgactattgatagcgacatgattgacatg 

85 EFEKMTDAQLKRHFVSKYSTIDSDMIDM 

12497 gttatccagttctgtctaaacgattactctagaattgacaatgaattggacaagctgtcgcgattgaaaaaggttgacgcatca

113 VIQFCLNDYSRIDNELDKLSRLKKVDAS 

12581 gtagttgaatccattgtcaagcacaagaccgaaattgacattttcagcctagttgatgatgtattggaatataggccggagcag 

141 VVESIVKHKTEIDIFSLVDDVLEYRPEQ

12665 gcaattatgaaagtgactgaacctttagccaaaggagaaagtcctattggattgcttaccttgctttatcaaaattttaataac 

169 AIMKVTELLAKGESPIGLLTLLYQNFNN 

12749 gcttgtcttgtgctaggagccgatgagcctaaagaagccaatctaggcattaagcagttcttaatcaataagattgtctataac 

197 ACLVLGADEPKEANLGIKQFLINKIVYN

12833 tttcaatacgagctggactcagcctctgaaggcatggctattttaggtcaagctatcgagggcataaagaatggtcgctataca 

225 FQYELDSAFEGMAILGQAIEGIKNGRYT

12917 gaaagttcagtggtctatatttctttgtataaaattttttcacttacttaa 12967 

253 ESSVVYISLYKIFSLT* 
dplORF020 

1864 atggttaatcaatacaatcagcctgaaagaggcaagattcgaatcaatgttcgcgaccctgagaaaatgcctatcatggaaatt 

1 MVNQYNQPERGKIRINVRDPEKMPIMEI

1948 ctcggtcctacaattcaaggtgaaggaatggttataggtcaaaagactattttcattcgaactggtggatgcgactatcatcgc 

29 FGPTIQGEGMVIGQKTIFIRTGGCDYHC 

2032 aactggtgtgactcagcctttacctggaacggtactactgagccggaatatatcacaggcaaagaagctgctagtcgaatcttg 

57 NWCDSAFTWNGTTEPEYITGKEAASRIL 

2116 aaactagctttcaatgataaaggtgaacagatttgtaaccacgtgacattgactggaggaaatcctgccttaatcaacgagcct 

85 KLAFNDKGEQICNHVTLTGGNPALINEP 

2200 atggctaagatgatttcgattctaaaagaacatggattcaagtttggtctcgaaactcaaggaactcgattccaagaatggttc 

113 MAKMISILKEHGFKFGLETQGTRFQEWF 

2284 aaagaagtaagcgatatcactattagtcctaaaccgccttcaagtggaatgagaactaatatgaaaattcttgaagctattgta 

141 KEVSDITISPKPPSSGMRTNMKILEAIV 

2368 gatagaatgaatgatgaaaaccttgactggtcatttaaaatcgttatctttgacgaaaatgacctagcttatgcgcgtgatatg 

169 DRMNDENLDWSFKIVIFDENDLAYARDM

2452 tttaaaactttcgaaggcaagttacgtccagtgaactacctttcagttgggaatgcaaacgcatacgaagaaggaaaaatcagt 

197 FKTFEGKLRPVNYLSVGNANAYEEGKIS 

2536 gataggcttcttgaaaagttgggatggctttgggataaagtgtatgaagacccagctttcaacaatgttcgacctttaccgcaa 

225 DRLLEKLGWLWDKVYEDPAFNNVRPLPQ 

2620 cttcatacacttgtttatgataataaaagaggagtataa 2658 

253 LHTLVYDNKRGV* 
dplORF021 

2504 acgcaaacgcacacgaagaaggaaaaatcagtgataggcttcttgaaaagtcgggatggctttgggataaagtgtatgaagacc

1 MQTHTKKEKSVIGFLKSWDGFGIKCMKT

2588 cagctttcaacaatgttcgacctttaccgcaacttcacacacttgtttatgacaataaaagaggagtataaaatgaaaattgag 

29 QLSTMFDLYRNFIHLFMIIKEEYKMKIE

2672 catctagataaaaccggtaacgtattagggagagagaacggatgggcttcccttaagccggatgaaattgtaaccttggacaat 

57 KLDKIGNVLGRENGWASLKPDEIVTLDN

2756 actgaggcagccgttcaaagactttttggcctattaggcgaggacgcagaacgtgacgggttgcaagatactccattccgtttt 

85 TEAAVQRLFGLLGEDAERDGLQDTPFRF 
2840 gttaaagcactcgctgaacataccgtagggtatcgagaagaccctaaacttcatctcgaaaaaacattcgacgtcgaccatgaa

113 VKALAEHTVGYREDPKLHLEKTFDVDHE 

2924 gaccttgttcttgtgaaagacatcccattcaattctttatgtgagcatcatttagctccgttcgtagggaaggtgcatattgca 

141 DLVLVKDIPFNSLCEHHLAPFVGKVHIA

3008 tacattcctaaggataagattacaggtctttcaaaattcggtcgagtggttgaaggatacgctaaacgacttcaagtacaagag 

169 YIPKDKITGLSKFGRVVEGYAKRLQVQE

3092 cgcttgactcaacaaatcgctgacgctattcaggaagttctaaatcctcaagcagttgcggtcatcgtagaggctgagcatact 

197 RLTQQIADAIQEVLNPQAVAVIVEAEKT

3176 tgcatgagcggacgcggtattaagaagcacggggcaacgacagtgacttcaactatgcgaggtcttttccaagatgacgcatct 

225 CMSGRGIKKHGATTVTSTMRGLFQDDAS

3260 gctcgagcagaattgcttcagttgattaaaaagtag 3295 

253 ARAELLQLIKK*
dplORF022 

30896 atgagtaaagacattctttacggaatcaagctcgtgcaaatcgaggagcttgacccattgactcagttgccaaaagtcggcgga 

1 MSKDILYGIKLVQIEELDPLTQLPKVGG 

30980 gctaactttgtcgtagatacggcagaaacagcagaactcgaagccgtgacctcggagggaactgaagatgtgaaacgcaatgac 

29 ANFVVDTAETAELEAVTSEGTEDVKRND 

31064 acgcgcattcttgctatcgtgcgtactccagaccttttatacggttatgacttaacattcaaggacaacacgcttgaccctgaa 

57 TRILAIVRTPDLLYGYDLTFKDNTFDPE

31148 atcatggccctaattgaaggtggtacagtacgtcaacaaggcggaactattgctggatacgacaccccaatgcttgcacaaggc 

85 IMALIEGGTVRQQGGTIAGYDTPMLAQG

31232 gcttctaatatgaaaccatttagaatgaacatctatgtgccaaactatgtaggtgactcaattgccaactacgtgaaaatcact 

113 ASNMKPFRMNIYVPNYVGDSIVNYVKIT

31316 ttgaataactgtaccggtaaagctccagggctttcaatcgggaaagagttctacgctcctgagttcaacatcaaggcacgtgaa 

141 LNNCTGKAPGLSIGKEFYAPEFNIKARE

31400 gcaaccaaagcaggtttgccagttaagtcaatggactatgtggcacaacttccagcggttcttcgtcgcgtgacattcgatttg 

169 ATKAGLPVKSMDYVAQLPAVLRRVTFDL 

31484 aacggtggaacaggaaccgccgacgcagttcgagttgaagcaggtaagaagatttctccaaaaccagttgaccctaccttaaca

197 NGGTGTADAVRVEAGKKISPKPVDPTLT

31568 ggtaaggctttcaaaggctggaaagttgaaggagaatcaactatttgggacttcgacaaccacatgatgcctgaccgagacgtc

225 GKAFKGWKVEGESTIWDFDNHMMPDRDV

31652 aaactcgtagcacaatttgcatag 31675 

253 KLVAQFA* 
dplORF023 

6419 atggccaagtccaatttaactagaattgcaaagatggttagagcaggaaacagtgaaggtcctgcttcatcttttgtcaattcg 

1 MAKSNLTRIAKMVRAGNSEGPASSFVNS 

6503 ctgacccgggttattgaacgaactcagcctgaatataatccttcgacatattataagcccagcggggttggtggatgtattcga 

29 LTRVIERTQPEYNPSTYYKPSGVGGCIR

6587 aaaatgtatttcgaaagaatcggcgagtccattatagataacgcagattctaacctaattgcaatgggcgaagctggaacattt 

57 KMYF ERIGES I I DNADSNLIA MGEAGTF 

6671 aggcacgaagttctccaagagtacatggttaaaatggctgaaatcgatgaggactttgaatggttgaatgtagcagagttcttg 

85 RHEVLQEYMVKMAEIDEDFEWLNVAEFL 

6755 aaagaaaatccagttgaaggaactaccgtcgacgagcgtttcaagaaaaacgattatgaaacgaagtgtaagaacgaacttctt 

113 KENPVEGTIVDERFKKNDYETKCKNELL 

6839 caactttcattcttgtgtgacggactagttcgatataaaggcaagctctacattttagagattaagactgaaaccatgttcaag 

141 QLSFLCDGLVRYKGKLYILEI KTETMFK 

6923 ttcactaaacatactgagccctacgaagaacacaagatgcaagcaacttgctacggaatgtgtctaggagtcgatgatgtcatt 

169 FTKHTE PYEEHKMQATCYGMCLGVDDVI 

7007 ttcctttatgaaaatcgagataacttcgaaaagaaagcctacacgtttcacatcacagacgagatgaaaaatcaagtccttgga 

197 FLYEMRDNFEKKAYTFHITDEMKNQVLG 

7091 aaaattatgacctgcgaagagtatgtagagaaaggcgaaagtcctaaaatctattgctcttcagcctattgcccatattgtaga 

225 KIMT CEEYVEKGES PKIYCSSAYCPYCR 

7175 aaggaaggtcgaaatctgtga 7195 

253 K E G R N L * 
dplORF024 

25992 atgaacgcagtagatggccaggtagttcatatcctacaagtattagcagaagatggaaatgctacggctgaaaagttcgaaaag 

1 MNAVDGQVVHILQVLAEDGNATAEKFEK 

26076 gaagtcagggctgcatctttagtattttcacgaagagcagccgaggcagttgtcaaaggtgaaatctataaggacggcaaaaac 

29 EVRAASLVFSRRAAEAVVKGEI YKDGKN 

26160 ctctcgaaacgtgtttggtcttcagccgcacgcgcaggaaatgatgttcaacaaatagtcacacaaggcctagcaagtggaatg 

57 LSKRVWSSAARAGNDVQQIVTQGLASGM 

26244 tctgctacagatatggctaaaatgctcgagaaatatatcgaccctaaggttcgaaaagattgggactttgataagatagctgag 

85 SATDMAKMLEKYIDPKVRKDWDFDKIAE

26328 aagctagggaaacctgccgctcacaaacaccaaaatctcgaatacaatgcccttcgacttgctcgaactaccattagccattcc 

113 KLGKPAAHKYQNLEYNALRLARTTISHS 

26412 gccacagctggagtgagacaatggggcaaggttaatccttatgctcgaaaagttcaatggcattctgttcacgctccaggtcga 

141 ATAGVRQWGKVNPYARKVQWHSVHAPGR 

26496 acgtgtcaagcgtgtatcgattcagatggtgaagtatttcctatcgaagaatgtcctttcgaccatcctaatggaatgtgctac

169 TCQACIDLDGEVFPIEECPFDHPNGMCY

26580 caaactgtatggtacgaaaactcactcgaagaaatcgctgatgagttgagaggctgggtagacggagaacctaatgatgtatta 

197 QTVWYENSLEEIADELRGWVDGEPNDVL

26664 gacgaatggtacgacgatttaagttcaggaaaagttgagaaatacagcgacctcgactttgttaaaagttattag 26738 

225 DEWYDDLSSGKVEK YSDLDFVKSY* 
dplORF025 

18778 atggcaaagaacaaaaagcgaaaaaaagtaaatgtcaaaaggaaaatgcttatccctacaaatctctcgaaaaaagtaaatgca 

1 MAKNKKRKKVNVKRKMLIPTNLSKKVNV 
18694 aaagcaatcgctcatagaaaagccactgttaagtggctgcctaatacagatgaaattcaagtatatttcgacctttatataaat

29 KAIAYRKVTVKWLPNTDEIQVYFDLYIN 

18610 aaaaacaggctgacaatgttaggcactattgacccggacaagagctattttgaaggaattaggattgtttgtaagaaacctcag 

57 KNRLTMLGTIDPDKSYFEGI RI VCKKPQ 

18526 ccttggatgactgttaaggagctccaggttgcgcgtgcagacgccccaggtttttttgcagttcttaaagcctattgtcacacg 

85 PWMTVKELQVARADAPGFFAVLKAYCHT

18442 gttggcgatgtactagatagcggagcagagqctactgaaattgttcaaggtattatgtataaagacggtgaactatttaaggac 

113 VGDVLDSGAEPTEIVQGIMYKDGELFKD 

18358 agtgaaattgtcagccttttcaaatacgatgtcaaagagccttatgagtttccaaaggaccttcctataaccttggacaacttt 

141 SEIVSLFKYDVKEPYEFPKDLP ITLDNF 

18274 ttagagttcattatgtctagccagcatactagagcacttgttttgcgttgtgctaatataggtgagttttccaagaattggcgg 

169 LEFIMSSQHTRALVLRCANIGEFSKNWR 

18190 aaatggcaaaaagctatccagctcctgctcgactatgccaaggcggatgactttaaagtagacgaaactgtttgggacttttca 

197 KWQKAIQLLLDYAKADDFKVDETVWDFS 

18106 cccggctctaaagccggaaaggtagcacgtcgtaaaggctatgaggcaattcaacaagcccttgagcagataaataaataa 18026

225 PGSKAGKVARRKGYEAIQQALEQINK* 
dplORF026 

21512 atggcgaaagctactggaccaaaagttcgaagaggaaaaactcctccacggccaaaagacaaaaaaggaatcaaagcaaatgcg 

1 MAKATGPKVRRGKTPPRPKDKKGIKANA

21596 cgtgtcaataaagaccagttcgtagagtatgactataaaggcatcaagatgacaattaaggaacgtgatgctagaatgaaattg 

29 RVNKDQFVEY DYKG IKMTI KERDARNKL 

21680 gaatttattagaggcatgactattcaggaaattgcagcccgctatggattaaatgaaaagcgtgttggcgaaatacgggctcgc 

57 EFIRGMTIQEIAARYGLNEKRVGEIRAR 

21764 gataaatgggtgaaggctaagaaagagttcgagaatgaaaaggctcttgttactaatgatacattgactcaaatgtatgcaggg 

85 DKWVKAKKEFENEKALVTNDTLTQMYAG

21848 tttaaagtctcagccaatattaaatatcacgccgcccgggagaaactaatgaacatcgtcgaaatgtgtttagataatcctgac

113 FKVSVNIKYHAAWEKLMNIVEMCLDNPD

21932 agatatttatttactaaagaaggaaatattagatggggcgcattagatgtcctttcgaaccttatagatagagctcaaaaagga 

141 RYLFTKEGNIRWGALDVLSNLI DRAQKG 

22016 caagaaagagcgaatggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggcc 

169 QERANGMLPEEVRYRLQIEREKITLLRA

22100 aaaatgggcgaccaggaaattgaaggcgaggttaaagataacttcgtagaagcactagataaagcagctcaagccgtttggcaa 

197 KMGDQEI EGEVKDNFVEALDKAAQAVWQ 

22184 gaatttagtgacgcaacaggttcctacattaaaggagtgactgataatgacaataagcctgagaaataa 22252 

225 EFSDATGSYIKGVTDNDNKPEK* 
dplORF027 

52762 atgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtgac 

1 MGKVSIQKSGTFSSGSNNEFFTLADHGD 

52846 agcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtagtccacgaagcagacgttgacggt 

29 SAIVTLLYDDPEGEDMDYFVVHEADVDG 

52930 cgtcgacgctatatcaattgcaatgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacgga 

57 RRRYINCNAIGEDGETVHPDNC PLCQNG 

53014 tcccctcgtattgaaaaactatttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttat 

85 FPRIEKLFLQLYNHDTGKVETWDRGRSY 

53098 gttcaaaagattgctacatttatcaataaatatggaagccttgtgactcagccttttgaaattattcgttcaggagctaaaggt 

113 VQKIVTFINKYGS LVTQPFE I I RSGAKG 

53182 gaccaacgaactacttatgaattccttccagagcgtccggaagacagtgctactcttgaagattttccagaaaagagcgaactt 

141 DQRTTYEFLPERPEDSATLEDF PEKSEL 

53266 cttggaactctaattttagacctcgacgaagaccaaatgtttgacgtggttgacggcaagtccactcttcaagaagagcgttct 

169 LGTL I LDLDEDQMFDVVDGKFTLQEERS 

53350 tcaagtcgttcaaattcacgtagaggagcatctcctgcgcctagacgaggttccggtcgagaatcttcacaaggtcgaacagct 

197 SSRSNSRRGASPAPRRGSGRESSQGRTA 

53434 gaaagaactccttcagttagtcgaagaactcctccaacacgaggtcgaggattctaa 53490 

225 ERTPSVSRRTPPTRGRGF* 
dplORF028 

44595 atgtcaaaaattaaattcgaaaaccttaaaaaaggcgatgttgtgctacgagctaaatctcaaacgaagtttaaaatcgtttca 

1 MSKI KFENLKKGDVVLRAKSQTKFKIVS 

44679 attttagcagacgaaaagaaagcagaccttgaatcattagaagacggaggtgaacttcacctttcagcttcaactctcgaacgt 

29 ILADEKKADLESLEDGGELHLSASTLER 

44763 tggcacacaatggaagatgaaactgaacctaaaaaagaagaagctgctaaacccgctaaaaaggctgctcctgcagttgctcga 

57 WYTMEDETEPKKEEAAKPAKKAAPAVAR 

44847 cctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttgaggaagaaattcctgaagttaaggaacagccggaa 

85 PARKGRVVPKPKKEVLEEEI PEVKEQPE 

44931 gaagttggttcagttagtgagaaatctactgttcgaaaacctgctcctaaaaaagaaagcgtgatggcgattactaaggctctt 

113 EVGSVSEKSTVRKPAPKKESVMAITKAL 

45015 gaaagtcgaattgttgaagcctttcctgcgtctactcgaatcgtcactcagtcttacatcgcctatcgctctaagaagaacttc 

141 ESRIVEAFP ASTRIVTQSYIAYRSK KK F 

45099 gttactatcgaagaaactcgaaaaggtgtttctattggagttcgcgcaaaagggttgacagaagaccaaaagaaacttcttgca

169 VTIEETRKGVSIGVRAKGLTEDQKKLLA

45183 tctattgctcctgcatcttacgaatgggcgattgacggaatttttaaactcgtcaaggaagaagatattgacaccgcaatggaa 

197 SIAPASYEWAIDGI FKLVKEEDIDTAME 

45267 ttgattgaagctccccacctctcttcgctatga 45299

225 LIEASHLSSL*
dplORF029 
662 atgaaatcagtagttttattatccggcggagtcgacccag

1 MKSVVLLSGGVDSATCLAIEVDKWGSKN



746 gttcatgctatagcattcaattacggacaaaagcatgaagcagaacttgaaaatgctgctaatgttgcaatgttctacggagtc 

29 VHAI AFNYGQKHEAELENAANVAMFYGV 

830 aagcccaccattcttgaaattgactcgaaaatctactcaagctctagctcttccttattacaaggaaaaggcgaaatttcacat 

57 KPTILEIDSKIYSSSSSSLLQGKGEISH

914 ggaaaatcttacgctgaaatcctagcagagaaggaagtagttgacacctatgttccatctagaaatggactaatgctttcacag 

85 GKSYAEILAEKEVVDTYVPFRNGLMLSQ

998 gccgcggcttatgcttattcggttggagcttcttacgtcgtatatggtgctcacgcagacgatgcggctggaggcgcttaccct 

113 AAAYAYSVGASYVVYGAHADDAAGGAYP

1082 gattgcactcctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaacccttgtcgctcctcta

141 DCTPEFYNSMSNAMEYGTGGKVTLVAPL

1166 cttactctaaccaaggcgcaagtcgttaaatggggaattgatttagatgttccttatttcttgactcgttcatgttatgaaagt 

169 LTLTKAQVVKWGIDLDVPYFLTRSCYES

1250 gacgctgaaagttgtggaacttgcgcaacttgtatcgaccgcaaaaaggcattcgaagaaaatggaatgactgaccctattcat 

197 DAESCGTCATCIDRKKAFEENGMTDPIH

1334 cataaggagaattga 1348 

225 Y K E N * 

dplORF030

20088 atgaataacgaaaaaattattgaaaaaattaaaaatcttattcaattagcaaatgacaacccgagtgacgaagaggggcaaact

1 MNNEKIIEKIKNLIQLANDNPSDEEGQT

20004 gcccttcttatggctcaaaagttgatgctaaagaataatatcgcacttgctcaagttgaacaatttgatgaacctaaacagttc

29 ALLMAQKLMLKNNIALAQVEQFDEPKQF

19920 gagacttctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcatattctcgcgactaatttt 

57 ETSQAVGKEAGRIFWWERELGKILATNF

19836 aggtgcttttgtattaatcagcgcgatatgcgcttgaataaaagtcgaataattttcttcggcgaaaaacaagacgctgaatta

85 RCFCINQRDMRLNKSRIIFFGEKQDAEL

19752 gtgtctaaaatatatgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaat 

113 VSKIYEAALLYLRYRIDRLPTREPSYKN

19668 tcatacctcaaaggctttttgtcagcctcagccattcgatttaaaaagcaggtggaagaatattcacttatggtcctacctagc 

141 SYLKGFLSALAIRFKKQVEEYSLMVLPS

19584 gagcaaacaaaaaatgcgcttcaggacacatttcgaaatttaaagaaggaaggaattgacagacctcaacatgacttcaatctt 

169 IQTKMALQDTFRNLKKEGIDRPQHDFHL

19500 gaagcgtatattgaagggcggtttcatggcgagaatgcaaagattatgcccgatgaaactttggaaggcggtaactaa 19423 

197 EAYIEGRFHGENAKIMPDEILEGGN*
dplORF031



26943 atggcttatcaattagaagacttgttaaaaggtctagatgaaccaactatcaaacaggtgaaggaaattatttcgaaaacttcg

1 MAYQLEDLLKGLDEPTIKQVKEIISKTS

27027 aaagaactcgatgctaaaattttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgttcaacag 

29 KELDAKIFIDGDGQHFVPHARFDEVVQQ

27111 cgcgatgcagctaacggctcaattaattcctataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcg

57 RDAANGSINSYKEQVATLSKQVKDNGDA

27195 cagaccactatccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtgattacttcagctcttcat 

85 QTTIQNLQEQLDKQSQLAKGAVITSALH

27279 ccgttgattagtgaccccattgctccagcagcagacattcttggatttatgaaccttgacaacattacggtcgaaagtgacggc 

113 PLISDSIAPAADILGFMNLDNITVESDG

27363 aaagttaaaggtcttgatgaagagttgaaagctgttcgtgagtctcgtaaatacttattcaaagaagtcgaagttcccgcagaa 

141 KVKGLDEELKAVRESRKYLFKEVEVPAE

27447 caagaggctcaagctaagtcgccagccgggactggaaatttaggaaatccaggtcgtgtcggtggtggtgttcccgaacctcgt 

169 QEAQAKSPAGTGNLGNPGRVGGGVPEPR

27531 gaaatcggctcttttggtaagcaacttgctgctgctcaacaaacggcaggagcacaagaacaatcatcattctttaaataa 

197 EIGSFGKQLAAAQQTAGAQEQSSFFK*

dplORF032

tagtttctagctatgtaggattcgaatgctggactgacgaagaatgcatcaggaactttgaacta

1 MKEANRLVSSYVGFECWTDEECIRNFEL

52117 gaccctgatatgtcaattgcgtctgcttatcatcgttattttgggatgctttattcctatgcaaaaaggtttaaatgcttatct 

29 DPDMSIASAYHRYFGMLYSYAKRFKCLS

52201 cgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttcaaatcaaaccaaggggccaagttttca 

57 RHDIESIAFETISKCLATFKSNQGAKFS

52285 acttaccttacaagactcttcaagaatagaatagtcttagaatataggtacctaaatgcaccttccatgaatcgaaattggtat 

85 TYLTRLFKNRIVLEYRYLNAPSMNRNWY

52369 gtagaagtgacgttcgatagcgtttcgacaaatgaagaaggcgacgattttagtatcctatcgacagttggctattgtgaagac

113 VEVTFDSVSTNEEGDDFSILSTVGYCED

52453 tacaqaaaaattgaaattgaagcaagtcctgacctcatgacgctttctaatacagagtatgcttatatctcgtctgtcattcaa

141 YGKIEIEASLDFMTLSNTEYAYISSVIQ

52537 aacggtccttcagtaagcgacgcagaaattgcgcgtgaaattggagtaagcaggtctgctattagtcagtctaagaagtcacta

169 NGPSVSDAEIAREIGVSRSAISQSKKSL

52621 aaaaataaattaaaagattttatataa 52647

197 KNKLKDFI * 
dplORF033 

7670 atggcaagacctaagttacctcaaattgatattcgagaagaagaaatacgagatgctcaagacgtagcagactcgtatggtgcg

1 MARPKLPQIDIREEEIRDAQDVADSYGA

7754 attatcaataaagcagtcgacgaaattgttgaagcagcttgcggttcacttgaccaggcaatggaagaaattcaaatagttgta 




29 IINKVVDEIVEAACGSLDQAMEEIQIVV

7838 agccaaaatcctgtcattatggaagaccttaactactacattggctatcttcccactcttctttatttcgccgcagatagggcg

57 SQNPVIMEDLNYYIGYLPTLLYFAADRA 

7922 gaaatggtgggaatacaaatggattcaagttctgctatcaggaaagaaaaatacgataatctacacattttagccgccgggaaa 

85 EMVGIQMDSSSAIRKEKYDNLYILAAGK

8006 actattcctgacaagcaagcagaaactcgaaaacttgtcatgaatgaagaagtcatcgaaaatgcttacaagcgagcctacaag 

113 TI PDKQAETRKLVMNEEVI ENAYKRAYK 

8090 aaagttcaattaaagctagaacaggccgataaggtattagcatctttaaaacgaattcaaacctggcaactagcagagttagaa 

141 KVQLKLEQADKVLASLKRIQTWQLAELE

8174 actcagtcaaataattcaaaaggagtattattaaatgcaaaaagacgtagacgtgaaaatgattga 8239

169 TQSNNSKGVLLNAKRRRREND*

dplORF034 

131 atgagtcaaaacactacacgcaccgacgctgaattgacaggcgttactcttttaggaaaccaagacaccaaatacgattatgac 

1 MSQNTTRTDAELTGVTLLGNQDTKYDYD

215 tataatccagacgtccttgaaactttccctaacaaacatcctgaaaacaattacctagtaacatttgacggatatgaattcact

29 YNPDVLETFPNKHPENNYLVTFDGYEFT

299 tccctttgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatggttgaatctaaa 

57 SLCPKTGQPDFANVFISYIPNEKMVESK

383 tcattgaaattgtacttattcagtttccgtaaccacggtgacttccacgaagattgcatgaacattattttgaatgacttgtat 

85 SLKLYLFSFRNHGDFHEDCMNIILNDLY

467 gaattgatggaacctaagtacattgaagtcatgggcctattcactcctcgtggtggaatttcaatttacccattcgtcaacaaa

113 ELMEPKYIEVMGLFTPRGGISIYPFVNK

551 gtgaatcctcaatttgcaactcctgaacttgaacagcttcaacttcaacgcaaattgaacttccttggaaatgttcaaggtctt 

141 VNPQFATPELEQLQLQRKLNFLGNVQGL

635 ggacgagctattcgatag 652 

169 G R A I R * 
dplORF035 

17425 atgcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagt 

1 MHLMKDSKMLRTWKSLAFEFETKVRTTS

17341 gggttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataaaatgaaggtatttatcaacaat 

29 QLKLSPAMKTMTRTKIWKGYKMKVFINN 

17257 catactgaagctgatattgactacaaagatattctaaattttgtagcttatcgaaactctcctaaccctcaaattcaaatcact 

57 HTEADIDYKDILNFVAYRNSPNPQIQIT

17173 agctggaacgctttgctttcctgctatacacggaatgagctttcttataaaggagtttcaataacggacttttttgaagccatt 

85 SWNALLSCYTRNELSYKGVS ITDFFEAI 

17089 caaactattgcaagttccttcactcacctagactcgaaaacaattgatacacaaaatgaaaagcgactcgaaaggattgaggaa 

113 QTIASSFTHLDSKTIDTQNEKRLERIEE 

17005 cttcagtcaagaataggtcattgtaactgtactatcgacgaacttaaaaaaggagtccacgaaatgccggatattgaatcagct 

141 LQSRIGHCNCTIDELKKGVHEMPDIESA

16921 atttcttaccagtacggacagattcttgcttatgaagatgaacttaattttctgctaaactaa 16859 

169 ISYQYGQI LAYEDELNFLLN * 
dplORF036 

48808 gtgttagtcgaacgaaaagccgacaaggaatgttgggaatggctagaagctgttcgagcaaatatagtcgaagaagttcgaaac 

1 VLVERKADKECWEWLEAVRANIVEEVRN 

48892 ggtcttagcattgttattgcttcgaatactgtcgggaatgggaaaactagctgggcggttcgacttttgcaacgctatttagca 

29 G L S IVIASNTVGNGKTSWAVRLLQRYLA 

48976 gaaactgcacttgacggaagaattgttgagaaaggaatgtttgtagtgtcagctcaactattgactgagttcggcgactataat 

57 ETALDGRIVEKGMFVVSAQLLTEFGDYN 

49060 tattttcaaaccatgcaagaatttctcgaacgtttcgagcgccttaagacttgtgagctattagtcatagacgaaataggtgga 

85 YFQTMQEFLERFERLKTCELLVIDE IGG 

49144 ggttccttaaccaaggcctcttatccttatctgtatgacttggttaattatagggttgacaataacttgtcgactatttatacg 

113 GSLTKASYPYLYDLVNYRVDNNLSTIYT 

49228 actaattatactgacgatgaaattattgaccttttaggccaaaggctttatagtcgtatatatgatacttcagtggttctagat 

141 TNYTDDEIIDLLGQRLYSRIYDTSVVLD

49312 tttcaggcaagcaatgtaagaggattggaggtaagcgaaattgaatcatag 49362 

169 FQASMVRGLEVSEIES * 
dplORF037 

55855 atggtgaagaaattgaaatctaaaatctattcagttgcatatataattctagtagttattgcgaaccttgtgacaatttatttc 

1 MVKKLKSKIYSVAYI ILVVIANLVTIYF 

55939 gaacctttaaacgtgaaaggaattttaattcctccaagcagttggtttatgggattcactttcctgcttataaatctaataagc 

29 EPLNVKGILIPPSSWFMGFTFLLINLIS

56023 aagtacgagaagccaaaatttgcaggttctttgatatgggtagggttattccttacctcgttgatttgctttacgcaaaaccta 

57 KYEKPKFAGSLIWVGLFLTS LICFMQNL 

56107 ccacaatcgcttgtcgtggcttcaggagttgcattttggataagtcaaaaagcaagtgtctttatattcgacaagctctcgaat 

85 PQSLVVASGVAFWISQKASVFIFDKLSN

56191 aaactagactcgaagattgcaaatgctttgtctagcaacatcggttctattatagacgcaaccatatggatttcattaggactg 

113 KLDSKIANALSSNIGSIIDATIWISLGL

56275 agtcctcttggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttctagttcagtttatcttgcag 

141 SPLGIGTVAYIDIPSAVLGQVLVQFILQ

56359 tcaattgcttcgagatatttgaaaaagtag 56388

169 SIASRYLKK*
dplORF038 

1350 atgagagtttctaaaaccttaacattcgacgcagctcatcaactagttggacattttggaaaatgcgcaaatttgcacgggcat 

1 MRVSKTLTFDAAHQLVGHFGKCANLHGH 

1434 acttacaaagtcgaaatttcattagcaggcggaacttatgaccacggttcgagtcaagggatggttgttgacttttatcacgtc 

29 TYKVEI SLAGGTYDHGSSQGMVVDFYHV 
1518 aagaaaatcgcaggtacattcactgacagacttgaccacgctgttcttcttcaagggaatgaaccaatcgctttagcaaatgca

57 KKIAGTFI DRLDHAVLLQGNE P IALANA 

1602 gttgacaccaagcgagttctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacggagctt 

85 VDTKRVLFGFRTTAENMSRFLTWTLTEL 

1686 atgtggaagcacgctcgtatcgaccctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgagattttc 

113 MWKHARIDSIKLWETPTGCAECTYYEIF 

1770 acagaagacgagattgaaatgttcaagaacgtaacctttatcgacaaagacgaaaagattactgtccgcgaaattttagagcag 

141 TEDEIEMFKNVTFIDKDEKITVREILEQ

1854 gagcaggataatggttaa 1871 

169 E Q D N G * 
dplORF039 

3306 atgaataaaagtgcaaccttttggcttgttcgaacagctcttattgcggctctatatgtgacattgaccgttgcattttccgct 

1 MNKSATFWLVRTALIAALYVTLTVAFSA 

3390 attagttatggacctattcaattcagagtcagtgaagccttgattcttctacctttatggaaccatagatggactccggggatt 

29 ISYGPIQFRVSEALILLPLWNHRWTPGI 

3474 gtattaggaacaattattgcaaacttcttttcacctcttggactgattgacgttttattcggttcacttgctaccttccttgga 

57 VLGTI IANFFSPLGLIDVLFGSLATFLG 

3558 gtagtggcaatggtgaaagttgccaagatggcaagtcctctatattcacttatctgtccagttcttgctaatgcttaccttatt 

85 VVAMVKVAKMAS PLYSLI CPVLANAYLI 

3642 gcgctggaacttcgaatagtttactctttacctttttgggaatctgtcatctatgtaggaattagtgaagcgattatcgtttta 

113 ALELRIVYSLPFWESVIYVGISEAIIVL

3726 atttcatacttccttatttccacgctggcgaagaacaatcattttagaacactgataggagcgaaaaatgggatttaa 3803 

141 ISYFLISTLAKNNHFRTLIGAKNGI*
dplORF040 

7192 gtgagctatactggaaaaatgttcgaggaagactttttcgaaggtgcaaaagactttgagaaagatgctttcacggtccgtcta 

1 VSYTGKMFEEDFFEGAKDFEKDAFTVRL 

7276 tatgataccactaatggatttcgaggagttgcaaatccctgcgattatatagccgcaactaactttgggaccttgtttattgaa 

29 YDTTNGFRGVANPCDYIAATN FGTLFIE 

7360 ctgaaaactactaaagaagcttctttgagctttaataacatcactgataatcaatggttccagctatcacgcgcagatggatgc 

57 LKTTKEASLSFNNITDNQWFQLSRADGC 

7444 aaatttattctcgccggaattttagcgtatttccaaaagcatgaaaagattatatggtatccaatttcaagccttgaaaaaatt 

85 KFILAGILVYFQKHEKI IWYPISSLEKI 

7528 aaacggtctggagttaaaagcgtcaacccaaacttcatcgatgcagggtatgaagtttcttacaagaagcgtcgaactagattg 

113 KRSGVKSVNPNFIDAGYEVSYKKRRTRL

7612 accattcctttccaaaatgttctagatgcagttgagcttcactacaaggagaaaagcaatggcaagacctaa 7683 

141 TIPFQNVLDAVELHYKEKSNGKT*
dplORF041 

8208 atgcaaaaagacgtagacgtgaaaatgattgaccctaaacttgaccgattaaaacacacaggtgattgggctgatgtacgaatt 

1 MQKDVDVKMIDPKLDRLKYTGDWVDVRI 

8292 agttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaagtatattcagtg 

29 SSITKIDADSADVSRCRKVLQKAQVYSV 

8376 gcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgcatcctcgttcc 

57 AAGECIKIAHGFALELPKGYEAILHPRS 

8460 agtctttttaagaaaactggtctaatcttcgtttctagcggagtgattgacgaaggttacaaaggtgacactgatgaatggttc 

85 SLFKKTGLI FVSSGVIDEGYKGDTDEWF 

8544 tcagtttggtatgctactcgtgacgcagatatcttctacgaccaaagaattgcccaatttagaattcaggaaaagcaacctgct 

113 SVWYATRDADIFYDQRIAQFRIQEKQPA 

8628 atcaagttcaatttcgtagaatcctcaggaaatgcggctcgtggaggccatggaagtacaggtgatttctaa 8699 

141 IKFNFVES LGNAARGGHGSTGDF* 
dplORF042 

48082 gtggcaaggcaaagaataggcaattcaggaaagcctaaaaatgaaattgaactaacattcaaagacaagcctaaaactcgttct 

1 VARQRIGNSGKPKNEIELTFKDKPKTRS 

48166 accttattcaagaaggacgtggcaacaggtctttcaaaagtcgagcatgattattttcaaatagttgaagcacttaacggaaaa 

29 TLFKKDVATGLSKVEHDYFQ I VEALNGK 

48250 caattcgaacctaatatgaagcaggtgtcatctttctttatagttcagtatgaatttattttcaatattaagtgcatcgattat 

57 QFEPNMKQVSSFFIVQYEFI FNIKCIDY 

48334 aactggttcaacttttcgagcactatgaaaaatgttcgaacttatttaaacattgagtcgaacattgaactttgtcgattttta 

85 NWFNFSSTMKNVRTYLNI ESN I ELCRFL 

48418 gctgaaagttttgttaaatatgaaaatgttcgaaaaagactgaacctaagcgaaaggttcataacggtctcgactttcaaaaga 

113 AES FVKYENVRKRLNLSERF I TVSTFKR 

48502 gcccggattttggacgaactcgaaggaaaaacgggttcaaaattcgaaggattttactag 48561 

141 AWI LDELEGKTGSKFEGFY* 
dplORF043 

31699 atgactaatattatcacagctgagcagtttaagcaacttgcatttcaaatcatcgcacttccaggattttcaaaaggtagtgaa 

1 MTNI ITAEQFKQLAFQI IALPGFSKGSE 

31783 cccatccatgttaaaattcgagcagcaggtgccatgaacctaatcgctaacgggaaaatccctaatacgcttttaggtaaagtg 

29 PIHVKIRAAGVMNLIANGKI PNTLLGKV 

31867 acagaactgtttggagaaacttcgacagtcactaaagacaatgctagtctagcatcaattactgaccaacagaagaaagaagcg 

57 TELFGETSTVTKDNASLASITDQTQKEA

31951 ctcgaccgattgaacaaaaccgataccggtattcaagacatggctgaacttcttcgagtattcgcagaagcttcaatggtagag

85 LDRLNKTDTGIQDMAELLRVFAEASMVE 

32035 cctacttacgctgaagtcggcgagtatatgacagatgagcaacttatgacaatcttcagtgcaatgtacggtgaagtgactcaa 

113 PTYAEVGEYMTDEQLMTIFSAMYGEVTQ

32119 gctgaaacctttcgtacagacgaaggaaatgtctaa 32154 

141 AETFRTDEGNV* 
dplORF044

25666 atggtaagtgttttgattagcagcagcccctttttgaagtccctgcttcattttagctcgacaagtatttctaaatcgaataag 

1 MVSVLISSSSFLKFLLHFSSTSISKSNK 

25582 gttttcaatttccttgtttcctacataagtggtgaaccgataatggcacttaggacattcgaagaatctccactctacgccctt 

29 VFNFLVSYISGEPIMALRTFEESPLYAL

25498 ttcgatatgtttcgaaataatctgtttagatgtaaggtcgaacctatgctcacaatggtcacaattaaccttgaacgtctgggt 

57 FDMFRNNLFRCKVELMLTMVT INLERLG 

25414 cgactccttcttcggttggttgttcagtttgttctttttctttgtcatcaacttcgtcttcttcactcgtttcatcttgaggct 

85 RLLLRLVVQFVLFLCHQLRLLHSFHLEA 

25330 cctcttgttcgtttaattcgtttgctaacacaggcaatgctccagctgagatttcgtcaagctgagcaagttcttccaaaatgc 

113 PLVRLIRLLIQAMLQLRFRQAEQVLPKC 

25246 gttcccattccttgtccgccttttccttcttactga 25211 

141 VPIPCPPPPSY* 
dplORF045

25340 atgaaacgagtgaagaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacg 

1 MKRVKKTKLMTKKKNKLNNQPKKESTQT 

25424 ttcaaggttaattgcgaccattgtgagcataagttcgaccttacatctaaacagattatttcgaaacatatcgaaaagggcgta 

29 FKVNCDHCEHKFDLTSKQI I S KHIEKGV 

25508 gagtggagattcttcgaatgtcctaagtgccattatcggttcaccacttatgtaggaaacaaggaaattgaaaaccttattcga 

57 EWRFFECPKCHYRFTTYVGNKEIENLIR 

25592 tttagaaatacttgtcgagctaaaatgaagcaggaacttcaaaaaggagctgctgctaatcaaaacacttaccattcatatcga 

85 FRN TCRAKMKQELQKGAAANQNTYHSYR 

25676 attcaggatgagcaagctgggcataaaatctcagggcttatggcgaagctaaagaaggagataaacattgaaaaacgagaaaaa 

113 IQDEQAGHKISGLMAKLKKEINIEKREK

25760 gaatgggtatctatatag 25777 

141 E W V S I * 
dplORF046

42774 atgccaatgtggctaaacgacacagcagtcttgacgacgattattacagcgtgcagcggagtgcttactgtcctactaaataag 

1 MPMWLMDTAVLTTI ITACSGVLTVLLNK 

42858 ttattcgaatggaaatcgaataaagccaagagcgttttagaggatatctctacaactcttagcactcttaaacagcaggtcgac 

29 LFEWKSNKAKSVLEDISTTLSTLKQQVD 

42942 gggattgaccaaacgacagtagcaatcaatcaccaaaatgacgtcattcaagacggaactagaaaaattcaacgttaccgtctt 

57 GIDQTTVAINHQNDVIQDGTRKIQRYRL 

43026 tatcacgacttaaaaagggaagtgataacaggctatacaactctcgaccattttagagagctctctattttattcgaaagttat 

85 YHDLKREVITGYTTLDHFRELSILFESY

43110 aagaaccttggcggaaatggtgaagttgaagccttgtatgaaaaatacaagaaattaccaattagggaggaagatttagatgaa 

113 KNLGGNGEVEALYEKYKKLP I REEDLDE 

43194 actatctaa 43202 

141 T I * 
dplORF047 

47542 atgaaatttgaagatgaaaaacagttcatcgctgcaattgaagaagccggtgaattaaatgctaccaaaggcgacatggagaaa 

1 MKFEDEKQFIAAIEEAGELNATKGDMEK 

47626 caagtcaaaagtcttcgtgatgctctaaaagagtacatgaaagaaaatgacattgaatctgctcaaggtaagcacttttctgct 

29 QVKS LRDALKEYMKENDI ESAQGKHFSA 

47710 accttctacacgacagagcgctcaactatggacgaagaacgcttgaaagaaattatcgaaaaattagttgacgaagccgagacg 

57 TFYTTERSTMDEERLKEI IEKLVDEAET 

47794 gaagaaatgtgtgaaaaactttcagggcttatcgaatacaagcctgtcatcaatacgaaacttctcgaggatatgatttatcac 

85 EEMCEKLSGLIEYKPVINTKLLEDMIYH 

47878 ggcgagattgaccaagaagcaattcttccagcagttgtcatttctgttacagaaggcattcgttttggaaaggctaaaatttag 47961

113 GEI DQEAILPAVVISVTEGI RFGKAKI * 
dplORF048 

16709 atggaaacaacactttatttcggttatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcatt 

1 METTLYFGYLTADWKDGHKNYTFHYESI

16625 cctgtaaaagaaactgagaaacaatataaggtcactggaatcaatcctaacttgtacttagacctaggctcagttattagaaag 

29 PVKETEKQYKVTGINPNLYLDLGSVIRK

16541 agcgaacttgacattgcagtattcaaagcatgtcctgtcgctgaaactggagtcacacttactcgcgacatggaagttgatgct 

57 SELDIAVFKACPVAETGVTLTRDMEVDA 

16457 agaattgaaatcatcaagaaattaactacaagaaccgaacgccctaacgaaagaattaaagcaagaaatgaacaaggtaaacaa 

85 RIEIIKKLTTRIERLNERIKARNEQGKQ

16373 gaaagccgccacctagtatctgcgctagaagattgcgctcgtcaaattgctggaatttatcaataa 16308 

113 ESRHLVSALEDCARQIAGIYQ * 
dplORF049

44018 atgtttcaaccatttctcagcgagcatgtagccttggtcgtcaaagtagaaccaagacttgttttctccgatatactcgaactc 

1 MFQPFLSEHVALVVKVEPRLVFFDILEL

43934 atcttttggataagttccgtttgctcgagcgtaccagaaaccagtagcatctttctgccagccaagtttcttctcagccggttg 

29 IFWISSVCSSVPETSSIFLPAKFLLSRL

43850 agcatttgcgttagtcaagcgatagacgtagtagtaaggttgacctgcatagtaccaacgctcatcgtggtcgttgacggaaat 

57 SICVSQAIDVVVRLTCIVPTLIVVVDGN

43766 tccgtcgtaggcgtagttgcagtgaatgatgttatcactgtcaatgaacatccctgtatgacctccagcgcctacgctagcacc 

85 SVVGVVAVNDVITVNEHPCMTSSAYAST

43682 tttgcgtccccagatgaagatgtcgcctcgtttagcatcccacggagcattttcactaattag 43620 

113 FASPDEDVASFSIPRSIFTN* 
dplORF050 

15081 atgaacaatcagcgaaagcaaatgaacaaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttc 

1 MNNQRKQMNKRIVELREDYQRARGRINF 
15165 cttcttgctgtaaaggaccacggcgaagaactcgaaaaccccgaagcctttgtgggatacactgacaatctagccgaatgtttt

29 LLAVKDHGEELENLEAFVGYIDNLVECF

15249 cctgaaagccaacgaaatgtcttgaggctatgtgtattagatgaccttccagccactaatgcggccgctgaaattggataccac

57 PESQRNVLRLCVLDDLPVTNAAAEIGYH

15333 tatacatgggttcaccaacttcgagacaaagcagttgaaacacttgaagaaattttagatggggataacattattcgctctaaa

85 YTWVHQLRDKAVETLEEILDGDNIIRSK

15417 cacggaatcgaaattaaggagaaacttgatgaattatatggtaaaagtcattctagttag 15476

113 HGIEIKEKLDELYGKSHSS
dplORF051

29765 atgagttatgacgtgaattatgttaagaatcaagttqgtagagccattgaaaccgctcctactaaaatcaaggtacttcgaaac

1 MSYDVNYVKNQVRRAIETAPTKIKVLRN

29849 tcttgggtcagtgatggatatggaggaaagaaaaaggataaagcgaatgaagtcgtagcagacgaccttgtttgtttagttgat

29 SWVSDGYGGKKKDKANEVVADDLVCLVD

29933 aattcaactgttcctgaccttttagccaattctactgacgcgggaaaaatttttgcccaaaatggagtgaaaattttcattcta

57 NSTVPDLLANSTDAGKIFAQNGVKIFIL

30017 tatgatgaaggcaaaatcattcaacgagccgatactatcgaaattaaaaactcaggaagacggtacagggtagtagaaacccac

85 YDEGKIIQRADTIEIKNSGRRYRVVETH

30101 aatcttctcgagcaagacattttgatagaacttaaattggaggtgaacgactaa 30154

113 NLLEQDILIELKLEVND*
dplORF052

30516 atgactaaacgaacgacaatgatggacagattgaaggaaattcttcctacatttcagctctcgcctgctcctatgcttccagga

1 MTKRTTMMDRLKEILPTFQLSPAPMLPG

30600 gttgaatttgacgagcaagatacagataggccggatgactacattgttcttcgatatagtcatagaatgcccagcgcaacaaat

29 VEFDEQDTDRPDDYIVLRYSHRMPSATN

30684 agcctaggaagttttgcttattggaaagttcaaatctacgtccattcaaactcaattattggtatcgacgaatatagcagaaag

57 SLGSFAYWKVQIYVHSNSIIGIDEYSRK

30768 gttcgaaacattatcaaggacatgggctacgaagtaacctatgcagaaactggtgactacttcgacacaatgctttctagatac

85 VRNIIKDMGYEVTYAETGDYFDTMLSRY

30852 cgactagaaatcgaatatagaattccacaaggaggaaactaa 30893

113 RLEIEYRIPQGGN*
dplORF053

50300 atgctaacattcgaaagaatagtatctatacgagcaccaacttgcatttcactcatttccccgctatatagaaggacatcatgc

1 MLTFERIVSIRAPTCISLISPLYRRTSC

50216 ccgttcttccaagcagttgcaagcattttatcaatagtccacgacttacctcgtccaggtcgagccattatgacaatcaaatcc

29 PFFQAVASILSIVHDLPCPGRAIMTIKS

50132 tcaccaggaagtaagcctccaagcacgtcgtccaatagttcaaaccctgtcgatattccaagtctttcaccgtcatggtttcta

57 SPGSKPPSTSSNSSNPVDIPSLSPSWFL

50048 atagtattcgcccagtctagtcgaagtttagcatttcgagcaatgtctagtccgcctacgaatttagagcgattgaaaagttct

85 IVFAQSSRSLAFRAMSSPPTNLERLKSS

49964 tctagttttggaattatattcgcaatcgcaatgttactatctacttga 49917

113 SSFGIIFAIAMLLST*
dplORF054

14423 atgtgtgaaaattgtcaaaacgaaacattcaatactagaattttcaatgaagatgaaagtggctatgtcgacgcctcattcact

1 MCENCQNETFNTRIFNEDESGYVDASFT

14507 tacaaggagattcgcgacaccgcagcagctattagcaatcgagcggtagaaaagaaagaccgtgacagccttttagtcgctaca

29 YKEIRDTAAAISNRAVEKKDRDSLLVAT

14591 gttatggctcttcccgtttctcacgcagaagatttaggcaagagactttgtattgcaaattctcgattggaagcatttcgtgaa

57 VMALPVSHAEDLGKRLCIANSRLEAFRE

14675 gctgttcaagaggctctcgagaatgaaaaggctgaagatttaaaggacgttatcttaggtcttatcgacgttgacaaaaaaatt

85 AVQEALENEKAEDLKDVILGLIDVDKKI

14759 ggcaaccttgcattgcaattagctgaatcaggagcattataa 14800

113 GNLALQLVESGAL*
dplORF055

27627 atgcctaatgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttg

1 MPNVRVKKTDFNQTTRSIVAIPDHYVAL

27711 gctgctcaaattccagctaccgcagcaactcaagtagggaacaagaaacacattcttgccggaacttgcgtgaaaaatgctact

29 AAQIPATAATQVGNKKYILAGTCVKNAT

27795 acatttgaaggacgcaaaactggactcgaagtagtatctaccggtgaacaattcgacggagttatcttcgctgaccaagaagtg

57 TFEGRKTGLEVVSTGEQFDGVIFADQEV

27879 tttgaaggtgaagaaaaagtaaccgtgacagtattagttcacggattcgtcaaatatgcagcccttcgaaaagttggcgatgct

85 FEGEEKVTVTVLVHGFVKYAALRKVGDA

27963 gtgcctgaatctaaaaacgcaatgattcttgtcgttaaatag 28004

113 VPESKNAMILVVK*
dplORF056

19151 atggaaaataaatggaaagttatccattttcaaaactcatgtattaaacaagtagacgatgaaaaaaggaggctcctgttcgaa

1 MENKWKVIHFQNSCIKQVDDEKRRLLFE

19067 gttccaggaactccttatcgtctacaagtttgggtgaaaatgagcttagttaaaattgaaacacgcgcaggaaatggctattat

29 VPGTPYRLQVWVKMSLVKIETRAGNGYY



18983 aaaaggctagtatgccaagacgattttgtattttatggtaaggagtcaatagatggttacttaattgacgccaccataactggc 

57 KRLVCQDDFVFYGKESIDGYLIDATITG

18899 aaatctttggcggaatattgtgagcctatgaacaggcatattctcgaaactattgcatcgcgagaagcagctgaactgaacaga 

85 KSLAEYCEPMNRHILETIASREAAELNR

18815 gctaaaaagcaagaccaacagaaatggagatactag 18780

113 AKKQDQQKWRY*
dplORF057
9859 atgcaaaaatctctatttggacctaagctagtgcctgctagttcaaggcgcaagaaaagaacggttccaaaacctaaacctaaa

1 MQKSLFGPKLVPASSRRKKRTVPKPKPK

9943 atcgatgagcaagtggttgagcttatgaaccgcagagagcgtcaagtgcttgttcatagttgcatctattattattttaatgac 

29 IDEQVVELMNRRERQVLVHS C IYYYFND 

10027 tcaattatagcagacgggcagtatgacaaatggagccacgaactatattctcttatagtttcgcaccctgatgagtttcgacag 

57 SIIADGQYDKWSHELYSLIVSHPDEFRQ 

10111 actgttctctataacgagtttaaacagtttgacggaaatactggaatgggtcttccatacgactgtcagtttgctgtaagggtc 

85 TVLYNEFKQFDGNTGMGLPYDCQFAVRV

10195 gcagaaaggcttttaagaaaatga 10218 

113 AERLLRK* 
dplORF058 

15633 atgacatcacgcgcatacaaaccaattcccacgcgcagagctagtgctaaacaagagaaggcagttgctaagcagttgggagga

1 MTSRAYKPIPTRRASAKQEKAVAKQLGG

15717 aaagtacagcctaattcaggagccactgactactacaaaggtgacgtcgtaacagactcaatgcttatagaatgcaagacagtt 

29 KVQPNSGATDYYKGDVVTDSMLIECKTV 

15801 atgaagccacaaagttcagtcagcttgaaaaaggaatggttcctaaaaaatgaacaggaaaggttcgctcaaaaactcgactat 

57 MKPQSSVS LKKEWFLKNEQE RFAQKLDY 

15885 tctgctatcgctttcgactttggtgacggaggcgaacagtatatagcaatgtctataagtcagttcaagcgaatattagaggat 

85 SAIAFDFGDGGEQYIAMSI SQFKRILED 

15969 agaaatgataaccttatttaa 15989 

113 R N D N L I * 
dplORF059 

30154 atgtctcagcctgaattagtatggaagcctgaagaatttgttagtaactgtgaacggtatcgaaacaagtttcaagtcgctgtc 

1 MSQPELVWKPEEFVSNCERYRNKFQVAV 

30238 acaacagtctgcgaagtcgctgctactaagatggaagaatacgcaaagacgcatgctatttggacagaccgtacagggaatgct 

29 ITVCEVAATKMEEYAKTHAI WTDRTGNA 

30322 cgacagaaactcaaaggagaagctgcttgggtaagcgcagaccaaatcatgatagctgtatcacatcacatggactacgggttt 

57 RQKLKGEAAWVSADQIMIAVSHHMDYGF 

30406 tggctagaactagctcatggtcgaaaatacaaaattctcgaacaggctgtagaagacaatgtcgaagaactttttagagcgttg 

85 WLELAHGRKYKI LEQAVEDNVEELFRAL 

30490 agaaggttattagactag 30507 

113 R R L L D * 
dplORF060 

38070 gtgatagctgtatctgctatccctactccgctctttccaggtacaccgtcgactccatcacgcccaggagctcccggtaaacct 

1 VIAVSAIPTPLFPGTPSTPSRPGAPGKP 

37986 gcgtcacctttaggaccttctagtcgaatccatgtaaagtcgtcaggaactaattcgctcggtttcttattagtattaaggaca 

29 ASPLGPSSRIHVKSSGTNSLG FLLVLRT 

37902 ccaatgtatttcccagattctgcattaaaattagtccctaaaatgtcatctgcgtatctaataacaacttgggactcatttaca 

57 PMYFPDSALKLVPKMSSAYL I TTWDSFT 

37818 gtttcccctgaaaggactccttcgccgtcctcatttagcaagtccatcaagtcttttcgagggtcttggaaaatgatagtagag 

85 VSPERTPSPSSFSKSIKSFRGSWKMIVE

37734 tttgaaaggtcgtcgtag 37717 

113 F E R S S * 
dplORF061 

19475 atggcgagaatgcaaagattatgcccgatgaaattttggaaggcggtaactaaaatgaaattcgaagtttattctgcgcgacta 

1 MARMQRLC PMKFWKAVTKM K F EVYSARL 

19391 tttgacgaagaggcgacatatgataggtatcgtgaagcactagagaaagttggaaatgtcgcttacttttgtgaaattgatact 

29 FDEEATYDRYREALEKVGNVAYFCEIDT 

19307 ggcaaccttgtaaccgaactcgagccagacagcctagatgacctaatcgcgctttcaaatgtagtgggaactggactaaaatta 

57 GNLVIELELDSLDDLIALSNVVGTGLKL 

19223 tcacggccttatagagaagataagcctcctcaattatggattgtcgacgggtacatggaataa 19161 

85 SRPYREDKPFQLWIVDGYME*
dplORF062 

45284 gtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttgacgagtttaaaaattccgtcaatcgcccattcgta

1 VRSFNQFHCGVNI FFLDEFKNSVNRPFV 

45200 agatgcaggagcaatagatgcaagaagtttcttttggtcttctgtcaacccttttgcgcgaactccaatagaaacaccttttcg 

29 RCRSNRCKKFLLVFCQPFCANSNRNTFS 

45116 agtttcttcgatagtaacgaagttcttcttagagcgataggcgatgtaagactgagtgacgattcgagtagacgcaggaaaggc 

57 SFFDSNEVLLRAIGDVRLSDDSSRRRKG 

45032 ttcaacaattcgactttcaagagccttagtaatcgccatcacgctttcttttttaggagcaggttttcgaacagtagatttctc 

85 FNNSTFKSLSNRHHAFFFRSRFSNSRFL 

44948 actaactga 44940 

113 T N * 
dplORF063 

47200 atgaaattcactgaaggaaaaaattggtataaagttggagagatatgtcaaatgttgaaccgctctctatctacgattaatgtt 

1 MKFTEGKNWYKVGEI CQMLNRSLSTINV 

47284 tggtatgaagcaaaagacttcgctgaagaaaacaacattcacttcccgtttgttcttcctgaacctagaacagaccttgaccat 

29 WYEAKDFAEENNIHFPFVLPEPRTDLDH

47368 cgcggttctcgattctgggatgacgaaggcgtgaacaaactcaaacgatttagggacaacctaatgcgcggtgajsttggcactc 

57 RGS RFWDDEGVHKLKRFRDNLMR G D L A F

47452 tacactcgaactcttgtagggaaaactgaaagggaagcaattcaagaagatgctaaagcacttaaacgtgaacatggattggag 

85 YTRTLVGKTEREAIQEDAKAFKREHGLE 

47536 aattaa 47541 

113 N * 
dplORF064
29108 at gg = t aca ttg aaa g ct= t ta g ca=^

29192 c^ctL.tct^t.aa^^ 

H„ 6 acaatt.aaLaUaagaa^ 

ll 36 0 Uci^ctLggaa^^ 

pc LARANGlDISSIbKi^&ni 



85 LA 
29444 gagtaa 29449 
113 E * 

dplORF065 



29 VFIKc l« r x\ « . «. M «.«.f^ 3 »- a nAprrrhrcat'cattitcaacctaa 



29 

51329 gccttgcctaactacttcgctagatgttccaaaatt 



ttttcagccactggtttccatagaaccctccatcgtttcgacctaa 



I"" ALPNYFARCSKIPFQPI'VSIEPSIVST* 

^r 0< \t g a==aa= Cg c g t=a gg t gg aa g caatacca=tt tr 

28814 "ica^t^^ 

" 7 30 

^8646 tcggttataagtgct^gttcaagaccatto^tagggcgaacacctgta^attttcgatgtcatccattgctgctaa 
" S " LVISVSVQDHSSRAHTCTIFDVIHCC. 
^"^tgacgattcgagtagacgcaggaaaggcttcaacaatt^actttcaagagccttagtaatcgceatca^ 



45061 fW"^— TT IR LSR ALVIAITLSFL 

aggacttctttttt^^ 



^ 06 \ tgg ca g c t caaac gg a= rr ^ 
29535 Uctc4. 3 .c.. 9r ..a r ^ 

li-03 rT aac Ttg aaaa g ^ 
2-£r^t g aaa= Ct tatcac g =c^^ 
20327 U-U™^^^ 



20243 -ty-f^^^ 

20159 tatgacaagacaattgaagtagacgacattgacttttcgaaagct^aaaatatgaeagaaagtga 
YDKTIEVDDIDFSKARKYDRK 



20094 



i 6057 -LaLa a 3 a C ^ 

2 i 6 i4i ^ TT . r ^^ 

E« h c TTTTTTTTrn s TTTTrrr g 16284 

^'^aaaca.^^ 

Use ^cLtaeactcacctc*^ 

3907 2 ^tatcUa^ 

391SC aa ggg a«ccc ggg aa g cca gg c g ca g ac gg taa g actaattatttccatata g 39209 
85 KGYPGSQAQTVRLIISI 

dplORF072

51045 atgttccttcgtcttcaagt^tcccgaaagtttttcaattatttgttcaggagtcgcttcaattcgaagaccatttactctca
1 MFLRLQVVSKVFQLFVQESLQFEDHLLS
50961 tcaaaatgcttcaactccttcccttgtaaccttacttcgaagacgagcagtcgacctagaggcttttgctttcaatggagagct 

29 SKCFNSFPCNLTSKTSSRPRGFCFQWRA 

50877 ttcgcctttttcagttccttcttcgccttcctctttgaatcctataagagtataggttccagtttcaacgtcccacatatatcc 

57 FAFFSSFFAFLFESYKS IGSS FNVPHIF 

50793 gatgatttttcggtcttcgccatatcggtttttaacgacagatag 50749 

85 DDFSVFAISVFNDR* 
dplORF073 

14262 gtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcactgaagcagaatacatcaagcgaacagaaaaaccta 

1 VNACRKNTTKKLGNLSLKQNTSSEQKNL 

14346 aagcagttgcaaaacctactcgaaaaactccagcgccttctcgtcgccctcgcccttaaaagaaaggttgaaataaaatgtgtg 

29 KQLQNLLEKLQRLLVALALKRKVEIKCV 

14430 aaaattgtcaaaacgaaacattcaatactagaattttcaatgaagatgaaagtggctatgtcgacgcctcattcacttacaagg 

57 KIVKTKHSILEFSMKMKVAMSTPHSLTR 

14514 agattcgcgacaccgcagcagctattagcaatcgagcggtag 14555

85 RFATPQQLLAIER* 
dplORF074 

32298 gtgacgaaaagaaaaatccaggattgcaaatgcttatggagtgactattttcagtcgctcctctttttgtatatagaaaggaaa 

1 VTKRKIQDCKCLWSDYFQSLLFLYIERK 

32382 ttacatggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtca 

29 LHGFWVNCSKNDFGYLKLHKS I KSCSKS 

32466 agcgcaacggctcgcactagagtcctcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgc 

57 SATARTRVFEVLSNWFCFNRIRERTYDC 

32550 ggttacccttcctcttatgggatttgcagccgcctctattaa 32591 

85 GYPSSYGICSRLY* 
dplORF075

22447 atggcaaagttttgtccgttgaattccgtcatggcccaaagggaaaatgaaagagccatcgatactgtttttcctgaacgaatg 

1 MAKFCPLKSVNAQRENERA I DTVFP ERM 

22363 gaaccgtctgctatgacgatatcgaaagttcgaaaaggtgagccctttgtccaccatgttaggagctggagctgtttcttacta 

29 EPSAMTISKVRKGEPFVHHVRSWSCFLL 

22279 aaagggacgaagttgaacttaggtagtttatttctcaggcttattgtcattatcagtcactcctttaatgtaggaacctgttgc 

57 KGTKLNLGSLFLRLIVI ISHSFNVGTCC 

22195 gtcactaaattcttgccaaacggcttgagctgctttatctag 22154 

85 VTKFLPNGLSCFI* 
dplORF076 

5728 gtgagagcattttcttcactcacgtcttcgagcaagtggtcgaatgtagggtactcttcatcttctgtaacaatatcaatattg 

1 VRAFSSLTSSSKWSNVGYSSSSVTIS IL 

5644 tactcaccattcccaataacttttagcgaagattcttcaggaactaatgtgacggttgcggccgtggtcttttctacaagtttt 

29 YSPFPITFSEDS SGTNVTVAAVVFSTSF 

5560 ccaaactgctctgctttcacaatcacgtcaatttcaacatcgctgtcgataatgcatcgaaggaagtttgagccatcatacgct 

57 PNCSAFTITSISTSLSIMHRRKFEPSYA 

5476 gtaaacatgacgcattcgccgtcaccaaaaatatgccaatag 5435 

85 VNMTHSPSPKICQ* 
dplORF077 

14800 atggaacgaataaagacgctatttcacgcgatttatgctaacggcactcatttagaagtagcagctttgttcgataccgttgat

1 MER I KTLFHVIYANGTHLEVAALFDTVD

14884 gattatgatgacgttatagaggacatccaggggtatattgatacccctgacctttataatcaaaggagcattagaatggcgcct 

29 DYDDVIEDIQGYIDTPDLYNQRSIRMAP 

14968 tacaatcctgacatcaatggtgacgctattgctactgacattttactacgactagatgatattatctacgtcgacgcaacttgt

57 YNPDINGDAIATDILLRLDDIIYVDATC 

15052 gaaactattaaatacgaggagcctattgcatga 15084 

85 ETI KYEEPIA* 
dplORF078 

17507 atggcaacagtaaaggaaacagtaaaatttgacggacgtcttgtaactatcttcgactacgacgatttagagtgggaaggatat 

1 MATVKETVKFDGRLVTI FDYDDLEWEGY 

17423 gcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagtgg 

29 APNEGFEDVEDMEVLSIRVRNEGEDDEW 

17339 gttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataa 17280

57 VEVIACYENDDEDEDLEGL* 
dplORF079 

35288 atggaactgataccattgataaatcctcgaacaaggttgacccctgcgcttaccatttgtccagcgaatccagtaaccttagaa 

1 MELI PLINPRTRLTPALTI CPANPVTLE

35204 acaattgaagctcccatgctgccaatcttagagacagctgaaccaatcattgacccaataccactaatgaagtttcgaatcagg 

29 TIEVPMLPILETAEPIIDPI PLMKFRIR 

35120 ttcgcacctcctgaaaccatctgtcccacaaagctagcaatcttgctaactaatgatgaaagcatgtttccagctgtcgataaa 

57 FAPPETICPTKLAILLTNDESMFPAVDK 

35036 agtgagccgagaagtgaagcaataccttga 35007 

85 SEPRSEAIP* 

dplORF080

42490 atgttgaaccttacaaaatcgcgccaaattgtggcagagttcactattggacaaggagccgaaaagaaacttgtcaaaacaacg

1 MLNLTKSRQIVAEFTIGQGAEKKLVKTT

42574 attgtgaacattgatgcaaacgcagtaccaaccgcctccgaaactcttcatgacccagacttgtatgctgcgaaccgtcgagaa 

29 IVNIDANAVSTVSETLHDPDLYAANRRE 

42658 cttcgagctgacgagcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattctagctgaacagtcaaagactgaaaca 

57 LRADEQKLRETRYAIEDEILAEQSKTET

42742 gctctaacagctgaataa 42759
85 A L T A E *
dplORF081

55466 atgttcaggaacagtatcgtccatctgttggtctgcgtcaaagttaaaggggtcgaaatcttcgttcttgctagcgtcgatata 

1 MFRNSIVHLLVCVKVKGVEIFVLASVDI 

55382 ctcgaactcgtactcaggaagactcataccaggaagccttcttcttcgaccggtagctgtttgaacatatcccaagccctgcgc 

29 LELVFRKTHIRKPSSSTGSCLNISQVLR 

55298 ccgctgttgaacgaatatgatatagtctgccactttagggaactcggtgaagaaatcttcaacaaccttattcgcttctttgac 

57 LLLNEYDIVCHFRELGEEIFNNLIRFFD 

55214 agatacattcatctgctcagcgattga 55188 

85 RYIHLLSD* 
dplORF082 

44728 gtgaacttcacctttcagcttcaactctcgaacgttggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgcta 

1 VNFTFQLQLSNVGTQWKMKLNLKKKKLL 

44812 aacccgctaaaaaggctgctcctgcagttgctcgacctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttg 

29 NLLKRLLLQLLDLLEKVESFPNLKKKSL 

44896 aggaagaaattcctgaagttaaggaacagccggaagaagttggttcagttagtgagaaatctacegttcgaaaacctgctccta 

57 RKKFLKLRNSRKKLVQLVRN LLFENLLL 

44980 aaaaagaaagcgtga 44994

85 K K K A * 
dplORF083 

35974 atgccttcagggtttttaaatcctgagtccttaaatcctgcgaaagtgagtcctacatattctagcacggttgcacctttgtcg 

1 MPSGFLNPESLNPAKVSPTYSSTVAPLS 

35890 acaaggtcaattccgtcgaccaatagcgtctgtctgctagccatctatttctcctttacggtgttacaatgttaccaaaccctg 

29 TRS I PSTNSVCLLAIYFSFTVLQCYQTL 

35806 atagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaactacgattgttccaatgt 

57 I E FLYFYYTI LSTVCQRRHCFELRLFQC 

35722 tga 35720 

85 * 
dplORF084 

15445 atgaatcatatggtaaaagtcattctagttagtgtctttgtactgtcagccttttgcatgacttgctcaatggtttatttggtt 

1 MNYMVKVI LVSVFVLSAFCMTCSMVYLV 

15529 acaggtaagcaagaggaccaccgtagtaccgtcgcccttgtatttggcgctctcgtaagctctgcggcgttctattcgacactc 

29 TGKQEDHRSTVALVFGALVSSAAFYSTL 

15613 tttatcctcgcctatctgccatga 15636 

57 FILAYLP* 
dplORF085 

10847 gtgatgactataatcaaggactttttcgagccttgtgatactgtcacgcattcctccatttgcaagtttcccaataaacgaaag 

1 VMTIIKDFFEPCDTVTHSSI CKFPNKRK 

10763 ggcgtcacgcccataactataaccagctccttcttcattttcactttcgacaataaattgaagttgattaacgatgtcgtcatt 

29 GVTLITITSSFFI FTFDNK LKLINDVVI 

10679 atcaattcgagtaaagtcaaaccgttgaactcgactgagaatagtgtcaggaatcttttgagggtcagtagtacatag 10602 

57 INSSKVKPLNSTENSVRNLLRVSST* 
dplORF086 

52760 atatgggaaaagtatcaattcaaaaatcaggaacatttagctcagggtctaataacgagtttttcacactcgctgaccacggtg 

1 IWEKYQFKNQEHLAQGLITS FSHSLTTV 

52844 acagcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtag 52906 

29 TAQLSLYCMMTRKAKTWI IS * 
dplORF087 

30036 acgattttgccttcatcatatagaatgaaaattttcactccattttgggcaaaaatttttcccgcgtcagtagaactggctaaa 

1 MILPSSYRMKIFTPFWAKIFPASVELAK 

29952 aggtcaggaacagttgaattatcaaccaaacaaacaaggtcgtctgctacgacttcattcgctttatcctttttctctcctcca 

29 RSGTVELST KQTRSSATTSFALSFFFPP 

29868 tatccatcactgacccaagagtttcgaagtaccttgattttagtaggagcggtttcaatggctctacgaacttga 29794

57 YPSLTQEFRSTLILVGAVSMALRT* 
dplORF088 

5040 atgaaaaaagttcaaacttatcaagaatatctaaaactagttgagttcaaacgtcaactttctttaaatcttcgagaaggaaaa 

1 MKKVQTYQEYLKLVEFKRQLSLNLREGK 

5124 ataggagtcgatgaagcggttattcaattattcaccttctatagtttcaacaatatcgaggaacctcctttcattgtactcaaa 

29 IGVDEAVIQLFTFYSFNNIEEPPFIVLK 

5208 atgcaagaggctgccgtgaacgggacttatgaagcaaaactcaatatgcttaaaagatttaaaattatttag 5279 

57 MQEAAVNGTYEAKLNMLKRFKI I * 
dplORF089

12495 atgtcaatcatgtcgctatcaacagtcgagtatttagacacaaaatgccttttcaactgcgcgtcagtcattttctcaaactca 

1 MS IMSLSIVEYLDTKCLFNCASVI FSNS 

12411 acacaattatcaggaaaggccctcagcaactcgcttcgcttgccaattttagtaaccatcaaaacaagtgtcccatatctaaca 

29 TQLSGKAFSNLLRLSILVTI KTSVPYLT 

12327 tccggaagccttttccacctcgactcattagacagaaactccttatcatctcgaacagcgaatattcgatga 12256 

57 SGSLFHLDSLDRNS LSSRTAN I R* 
dplORF090

27037 atgctaaaactttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgacgaagttgttcaacagcgcgatgcag 

1 MLKFSLTATVNILYLTHVSMKLFNSAMQ 

27121 ctaacggctcaattaattcttataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcgcagaccacta 

29 LTAQLILIKNKSRRFLNRSKITVMRRPL 

27205 tccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtga 27261 

57 SKTFKSNSTSSLNLQKAL*
dplORF091

43189 atgaaactatctaacgaacaatatgacgtagcaaagaacgtggtaaccgtagtcgtcccagcagcgattgcactaactacaggt
1 MKLSNEQYDVAKNVVTVVVPAAIALITG 
43273 cttggagcgttgtatcaatttgacactactgctatcacaggaaccattgcacctcttgcaacttttgcaggtactgttctagga 
29 LGALYQFDTTA I TGT I ALLAT FAGTVLG 

43357 gtttctagccgaaactaccaaaaggaacaagaagctcaaaacaatgaggtggaataa 43413
57 VSS RNYQKEQEAQNNEVE* 

dplORF092 

46989 atgaaaactatctccatattaaggaaagacactaaaaggaagccggacaggaacggaagaaaaactgcactcgaactagctcaa 
1 MKTISILRKDTKRKPDRNGRKTALELAQ 
47073 gagattgatatgtcacctagtgagttagcagagctccttcaaattcctgaaaggacggcaaccagaattttaaaactcgacaaa 
29 EIDMSPSELAELLQI PERTATR ILKLDK 

47157 ctgctcaacaaagagcaatgctcaataatagaaaggtatataaatgaaattcactga 47213 
57 LLNKEQCS I IERYINEIH* 

dplORF093 

45756 atgcaacatacgattaaacaatgtttgaaacttgccttcctgctaactgcaatatcaattgcctgtttagttttccctaaacct 
1 MQHTIKQCLKLAFLLTAISIACLVFPKP 
45672 tgctcatcgcctaaaaggaaacatggatgctcttgtgcgtattcgaaacattcaacctggtgcgcgaatggagtagtcttgaac 
29 CSS PKRKHGCSCAYSKHSTWCANGVVLN 

45588 gaaaactgctcattgcttgaagaagctattcggtttcgagagtcaatgtag 45538 
57 ENCSLLEEAIRFRESM* 
dplORF094 

8281 atgtacgaattagttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaag
1 MYELVLSLKLTPTAPMSQDVEKCFKRLK 

8365 tatattcagtggcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgc 

29 YIQWRQVNALKLHTDLLLNFLRDM KQSC 

8449 atcctcgttccagtctttttaagaaaactggtctaa 8484 

57 ILVPVFLRKLV* 
dplORF095 

8877 gtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaagactgaagaac 

1 VGKLLQL STLSRMRKWYLSRNGNRRLKN

8961 tcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactcgatagtcttc 

29 SRKSWKMRVHPKLARLLSRNLKCNSIVF 

9045 aagagcctcttaagattgtatatcttgaccttgagaatacattag 9089 

57 KSLLRLYILTLRIH* 
dplORF096 

46681 gtgattcataaattcttcaatttcgttgaacttatctgcggtttctcctgttaccaggttgcatttgactgtcttcgaaagtat 

1 VIHKFFNFVELICGFSCYQVAFDCLRKY 

46597 cttagcaagaggttcaataaccttttcccaattgctaaatatcacgcaggactttccttgctggatacattcctcgacaatttc 

29 LSKRFNNLFPIAKYHAGLSLLDTFLDNF 
46513 gatacatctttcgaacttgcaagacttgacatcttgagtagttaa 46469 
57 DTSFELARLDILSS* 
dplORF097 

39100 atggacgggattgaaatcttgatactgaccgacgtatgctcgtccgctgtcagtatgactaaatccctcaccgtttggactatt 

1 MDGIEILILTDVCSSAVSMTKSLTVWTI 

39016 agagaaagcgaggtgagtacattgcgaacgtccgtcagctcctgcaggtccaggaattccttgaagcccttgaggaccttgaag 

29 RESEVSILRTSVSSCRSRNSLKPL RTLK 

38932 accttgaactcctctaggacctgtttcacctatcttggaaactga 38888

57 TLNSSRTCFTYLGN* 
dplORF098 

43627 gtgaaaatgctccgtgggatgctaaacgaggcgacatcttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcata 

1 VKMLRGMLNEATSSSGDAKVLAQALEVI 

43711 cagggatgttcattgacagtgataacatcattcactgcaactacgcctacgacggaatttccgtcaacgaccacgatgagcgtt 

29 QGCSLTVITSFTATTPTTEFPSTTTMSV 

43795 ggtactatgcaggtcaaccttactactacgcctatcgcttga 43836 

57 GTMQVNLTTTSIA* 
dplORF099

38298 atgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtatttat 

1 MQVRHLLLKLQLVDGLRKFLPSQVVS IY 

38382 ggactcgaacaagatggcgctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcctaaag

29 GLEQDGATLTKLMKLDIQFQEWASRVLK 

38466 gtgacgcaggtcgtgacggtattgcaggaaagaacggaatag 38507 

57 VTQVVTVLQERTE* 
dplORF100

1597 atgcagttgacaccaagcgagttctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacgg 

1 MQLTPSEFYLDLELRLRICQDSLPGLSR

1681 agcttatgtggaagcatgctcgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgaga 

29 SLCGSMLVSTLSNYGKLLQVAQNVL TT R

1765 ttttcacagaagacgagattgaaatgttcaagaacgtaa 1803

57 FSQKTRLKCSRT* 
dplORF101

19220 gtgataattttagtccagttcccactacatttgaaagcgcgattaggtcatctaggctgtctagctcgagttcgattacaaggt 

1 VII LVQFPLHLKAR LGHLGCLARVRLQG

19304 tgccagtatcaatttcacaaaagtaagcgacatttccaactttctctagtgcttcacgatacctatcatatgtcgcctcttcgt 

29 CQYQFHKSKRHFQLSLVLHDTYHMSPLR
19388 caaacagtcgcgcagaataaacttcgaatttcactttag 19426
57 QIVAQNKLRISF* 



illf™ 2 atgataacgtgggaatgtttgactgtatcgccgaactcgataaaattc^^ 

1 MIT W E CLTVSPNSIKPLVYLDSLRHVNS 

4118 ttttggaagcaccacaaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagcccttacctattt 

29 FW K HHKFLGIIIYTCASEWLRKTSSYLF 



4202 tccatatgggagaagactttaaatggctcaacttga 4237 

57 SIWEKTLHGST*



49352 ttgaatcatagatatagtaacatcacaactatttttctttggcagattgtctttctttgtatttgctgcgcggtgtcctattgt
1 L NHR YSNITTIFLWQIVFLC I CCAVSYC

49436 gcaggagtgcataatgagcgagagtctcaagataaggtgattcaaagttataagcagaaagaaaagtcagccgtctacttgaca
29 AG VHNERESQDK VIQSYKQKEKSAVYLT

49520 gtcgatagttcaggagcttggctaggaagtgctccgggagccaaggaaagtcctctctacaatgaaaagggacagcatgtagga

57 VDSSGAWLGSAPGAKE S PLYN EKGQHVG

49604 aaattgaaagaggtgggagagtga 49627 

85 KLKEVGE* 

21259 gacccctttctatgctcgacttttcgagtgttttga 21224 

57 DPFLCSTFRVF* 

dplORF105 cacc agttcgaatgaa aatagtcttttgacctataaccattccttcaccttgaattgtaggaccgaaaat 

1 M I VASTSSNENSLLTYNHS FTLNCRTEN 

1944 ttccatgataggcattttctcagggtcgcgaacattgattcgaatcttgcctctttcaggctgattgtattgattaaccattat 

29 F H D R H FLRVANIDSNLASFRLIVL INHY 

1860 cctgctcctgctctaaaatttcgcggacagtaa 1828

57 PAPALKFRGQ* 

nrrrrT rrrrrrr rrrrrrrrrrirrfT 

„445 "ccccattatt^ 

10361 ggttttaccagttccagcgccaccacagaatag 10329 
57 GFTSSSATTE* 

dplORF107 ttt cgtttattgggaaactcgwaatggaggaatgcgtgacagtatcacaaggcccgaaaaagtccttg

10750 /g 9 ^^^ prllGNLQMEECVTVSQGSKKSL 

10834 attatagtcatcacgttgacatggaagccgtttctaatgcactag 10878 
29 IIVITLTWKPFLMH*

dplORF108

49447 atgcactcctgcacaataggacaccgcgcagcaaatacaaagaaagacaatctgccaaagaaaaatagttgtgatgttactata
1 MHSCTIGHRAANTKKDNLPKKNSCDVTI
49363 tctatgattcaatttcgcttacctccaatcctcttacattgcttgcctgaaaatctagaaccactgaagtatcatatatacgac
29 SMIQFRLPPILLHCLPENLEPLKYHIYD
49279 tataaagcctttggcctaaaaggccaataa 49250 
57 YKAFGLKGQ* 

dplORF109

31632 atgtggttgtcgaagtcccaaatagttgattctccttcaactttccagcctttgaaagccttacctgttaaggtagggtcaac^
1 MWLSKSQIVDSPSTFQPLKALPVKVGST

31548 ggttttggagaaatcttcttacctgcttcaactcgaactgcgtcggcggttcctgttccaccgttcaaatcgaatgtcacgcga

29 GFGEIFLPASTRTASAVPVPPFKSNVTR

31464 cgaagaaccgctggaagttgtgccacatag 31435 

57 RRTAGSCAT * 



16444 atgatttcaattctagcatcaacttccatgtcgcgagtaagtgtgactccagtttcagcgacaggacatgctttgaatactgca 

1 M ISILASTSMSRVSVTPVSATGHALN T A 

16528 atgtcaagttcgctctttctaataactgagcctaggtctaagtacaagttaggattgattccagcgaccttatattgtttctra 

29 M SS S LFLITEPRSKYKLGLI PVTLYCFS



:ca 



16612 gtttcttttacaggaatgctttcatag 16638 
57 VSFTGMLS * 



28657 gtgactctatcaagaaagctcttgcaattggtgttcaaggtccttgggaaaacttcttgcttcttgcaagtgacgctgagaaat 

1 VTLSRKLLQLVFKVLGKTSCFLQVTLRN

28741 tcatcgctgaaaaaacaggtcttcaaatcgctgtctactctaagaaaattgctcagttcgctgacgctgacaaacttcct^acg 

29 SS LKKQVFKSLSTLRKL LSSLTLTN^P- LT 



28825 ctggtaacattcgtcagttcaacctga 28851 
57 LVTFVSST* 



dplORF112

32207 atgcaaactgatttaggcaaatactgcttcgacgcagcagccgttgcttatattagatatttgcaggaagacaagactcctagg
1 M QTDLG KY C FDAAAVAY I RY LQEDKT PR 

32291 tatcccggtgacgaaaagaaaaacccaggattgcaaatgcttatggagtga 32341
29 YPGDEKKNPGLQMLME*
dplORF113

17715 atgaaaacagttaaagaagcaatcaaacaattcggtgatgaatggtggtacgaaattatcaacgaaaacggccaaatgattcaa 

1 MKTVKEAIKQFGDEWWYEI INENGQMIQ 

17631 gacggaagaatcgaagacatgggcgaatacatggaagaaacggtcgaccaagttaagttcatcaactatggtgacatcgaatct 

29 DGRIEDMGEYMEETVDQVKFINYGDIES 

17547 caaattatcaaactatatatcgcataa 17521 

57 QI IKLYIA* 
dplORF114 

52952 atgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacggattccctcgtattgaaaaactat 

1 MLLAKTGKQSIL I I VHYAKTDS LVLKNY

53036 ttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttatgttcaaaagattgttacattta 

29 FFNFTTMIREKLKHGTEAVLMFKRLLHL

53120 tcaataaatatggaagccttgtga 53143 

57 SINMEAL* 
dplORF115

5342 atgagcctcctttttttgatatatataatatacacgaattatcgcgagtttgtaaagccgtttctaaataattttaaatctttt 

1 MSLLFLIYI IYTNYREFVKPFLNNFKSF 

5258 aagcatattgagttttgcttcataagtcccgttcacggcagcctcttgcattttgagtacaatgaaaggaggttcctcgatatt 

29 KHIEFCFISPVHGSLLHFEYNERRFLDI 

5174 gttgaaactatagaaggtgaataa 5151 

57 VETIEGE* 
dplORF116

20662 atgaaattttcaaactttgctaaagcacttactaatgaatacctaatggtagtgaacaatgaccaagctgaagtcttaggcgca 

1 MKFSNFAKALTNEYLMVVNNDQAEVLGA 

20578 ggaaatatcgaaaacattctcaacggttcgaactttgctaatgttgtagctgaagcgacagttttaaaactcgaaaaactcagc 

29 GNIENILNGSNFANVVAEATVLKLEKLS 

20494 gaagaggaagctattgagtag 20474 

57 E E E A I E * 
dplORF117 

24680 atgataacaggctgctcgaacattttaaatcgaagtgaatctcgtaagtcactaatagttttgttcaagttatctgctactgtg 

1 MITGCSNILNRSESRKSLIVLFKLSATV 

24596 ataaggtctttgacatcgcttgtcccgtatatgtcattagtcaatggttcattaagaataactcgacaaggaatttgcttcaag 

29 IRSLTSLVPYMSLVNGSLRITRQGICFK 

24512 ccggttggggcggattcttga 24492 

57 PVGADS* 
dplORF118

15023 atgatattatctacgtcgacgcaacttgtgaaactattaaatacgaggagcctattgcatgaacaatcagcgaaagcaaatgaa 

1 MILSTSTQLVKLLNTRSLLHEQSAKANE 

15107 caaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttccttcttgctgtaaaggaccacggcga 

29 QTNRRTSRRLSTCKRSNKLPSCCKGPRR 

15191 agaactcgaaaaccttga 15208 

57 R T R K P * 
dplORF119 

41054 atggaggttcaacatccccgattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagat 

1 MEVQHPRFSTSYFFGHFFSRHDFSGSTD 

41138 tttaacagggaacaacttcctccaaatcatgtcgaacattcaagtcaacttcaacaatgcttccggcgcttacggatccactat 

29 FNREQLPPNHVEHSSQLQQCFRRLRIHY 

41222 ccaagcatttcacgctga 41239 

57 P S I S R * 
dplORF120 

28387 gtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaatacggtaaattcactgtcaaatcaactaacagcgaggctc 

1 VLKRKQNTCVCNCFNTVNSLS NQLTARL 

28471 aatacacttacgaccacaacatggatgctaagcaacaatatgcagtcactaagaaatggactaacccagctgaaagtgacccta 

29 NT LTTTTWMLSNNMQSLRNGLTQLKVTL

28555 tcgctgacattttag 28569 

57 S L T F * 
dplORF121 

39222 gtgcagacggatcacgtgagttcagtttggaagataataatcaacaatatatgggttattactccgattatgagcaagcagata

1 VQTDHVSSVWKIIINNIWVITPIMSKQI 

39306 gcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattctttat 

29 AGIELSIDGLTALPMFKWEVETSSLILY 

39390 ttgaatttggtttaa 39404 

57 L N L V * 
dplORF122 

40402 atgttattctccttatcctacataccgaatcacgttcatgtctggattaaacgagtattgttccgttctaaatcggccgacttg 

1 ML FS L S Y I PNHVHVWIKRVLFRSKSADL 

40318 aatggattgggtaaagatcccgttatcgatgtgaatgaacccttgcgtaaggtacataacttcattccctgcggagaacataga 

29 NGLGKDPVIDVNEPLRK VHNFIPCGEHR

40234 aattcggtcacttga 40220

57 N S V T * 



dplORF123 

21327 atggttcgacttctcgaaggattgaggttttcgaaccggttgagtttttcgagcattctcgacttttcgacccctttctatgct 




1 MVRLFEGLRFSNRLSFSS I LDFSTPFYA 

21243 cgacttttcgagtgttttgaggttttcgagcaggttcgacttttcgagaaattgagtttttcgacctctaaattaggctcgatt 

29 RLFECFEVFEQVRLFEKLSFSTSKLGSI 

21159 attcgaaaagtttag 21145 

57 I R K V * 
dplORF124 

17891 atggtaaaagttaaagatttgcaagtaggaatgaaagttgtaaatgcaaaaggtactgaatttaaagtaactgaccgtcaaggt 

1 MVKVKDLQVGMKVVNAKGTEFKVTDRQG 

17807 cgtaaatgggtaagcctagaacgtcttagtgatggacgtattcggttctatgataacgaatcactaatggacgaaaaagtggag 

29 RKWVSLERLSDGRIRFYDNES LMDEKVE 

17723 gtagtaaaatga 17712 

57 V V K * 
dplORF125

49916 atgtcctcagccgcttccgttaaaattggaacaagtgaattatatagatgctcctcttttagcttgtcgataaggtattcatca 

1 MSSAASVKIGTSELYRCSSFSLSIRYSS 

49832 gtttcgccaatttcgaaaaattcgaatccaggaaaacggtcgagaatagtttcgtcgtccggaactcttccatatctcgaaaag 

29 VSPISKKSNPGKWSRIVSSSGTLPYLEK 

49748 tgttcttga 49740 

57 C S * 
dplORF126

16136 atgagctcaagtacgttttctcgaacaatagggtcaagtccagttatatcaacgaactgtatatcgtcctcttgtataggaata 

1 MSSSTFSRTIGSSPVISTNCISSSCIGI 

16052 aggtctgcgtacagttgcatggctgaccctttaattggagtaactgttccttcactgtttattttaaataaggttatcatttct 

29 RSAYSCMADPLIGVTVPSLFILNKVIIS 

15968 atcctctaa 15960 

57 I L * 
dplORF127 

13511 atgctaaatagctttcccattcaccgtcgctgttcttgcgccatttttcagtttcacgataccgaccaactttgcaaaggtcgt 

1 MLNSFPIHRRCSCAI FQFHDTDQLCKGR 

13427 gaaatagtgctacgattgcaactgtttccattgggtaaatgtcttcccagcctttgcctaccatggtatccatttcgaaaagta 

29 EIVLRLQLFPLGKCLPSLCLPWYPFRKV 

13343 gttgattga 13335 

57 V D * 
dplORF128 

4852 atgacagcagttcaacaagttaagttctacttagaagaagccggcgctcactttctaaaagatgttgagtacagtgacaactta 

1 MTAVQQVKFYLEEAGAHFLKDVEYSDNL 

4936 gagcaagcaattatgaaagatattcttaaatggaatggcgctcatagagatgagcacgatatgaaaataacttcatacgaagta 

29 E Q A I MKDILKWNGAHRDEHDMK ITSYE V

5020 ttatag 5025 

57 L * 
dplORF129 

25133 atgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgcagccaccaatcttacattgaagaattcagta 

1 MNFLLSNLRSLKFKLMYAATNLTLKNSV 

25217 agaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcagcttgacgaaacctcagctggagcattgcctg 

29 RRKRRTRNGNAFWKNLLSLTKSQLEHCL

25301 tattag 25306 

57 Y * 
dplORF130 

16789 gtgcttgactttattcctttattatcgtataatcataatataaataaaacaagcgtcaaggacgcagaaagaggtcaattatgg 

1 VLDFIPLLSYNHNINKTSVKDAERGQLW 

16705 aaacaacactttacttcggttatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcattcctg 

29 KQHFISVILQQIGKTVTRTTLSTMKAFL 

16621 taa 16619 

57 * 
dplORF131 

43846 atgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaacggaacttatccaa 

1 MLNRLRRNLAGRKMLLVSGTLEQTELIQ 

43930 aagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggctacatgctcgctgagaaatggttga 44013

29 KMSSSISKKTSLGSTLTTKATCSLRNG* 
dplORF132 

15304 gtgactggaaggtcatctaatacacatagcctcaagacatttcgttggctttcaggaaaacattcgactagattgtcaatgtat 

1 VTGRSSNTHSLKTFRWLSGKHSTRLSMY 

15220 cccacaaaggcttcaaggttttcgagttcttcgccgtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttga 15137

29 PTKASRFSSSSPWSFTARRKFIRPLAR* 
dplORF133 

8061 atgacttcttcattcatgacaagttttcgagtttctgcttgcttgtcaggaatagttttcccggcggctaaaatgtatagatca 

1 MTS S FMTSFRVSACLSGI V F P A A K M Y R L

7977 tcgtatttttctttcctgatagcagaacttgaatccatttgtattcccaccatttccgccctatctgcggcgaaataa 7900 

29 SYFSFLIAELESICIPTISALSAAK*
dplORF134 

498 atgacttcaatgtacttaggttccatcaattcatacaagtcattcaaaataatgttcatgcaaccttcgtggaagtcaccgtgg 

1 MTSMYLGSINSYKS FKIMFMQSSWKSPW 

414 ttacggaaactgaataagtacaatttcaatgatttagattcaaccatcttttcgtttggaatgtaa 349
29 LRKLNKYNFNDLDSTIFSFGM *

dplORF135 

780 atgaagcagaacttgaaaatgctgctaatgttgcaatgttctacggagtcaagttcaccattcttgaaattgactcgaaaatct 
1 MKQKLKMLLMLQCSTESSS P FLKLTRKS 

864 actcaagctctagctcttccttattacaaggaaaaggcgaaatttcacatggaaaatcttacgctgaaatcctag 938 

29 TQALALPYYKEKAKFHMENLTLKS* 
dplORF136 

55252 gtgaagaaatcttcaataaccttattcgcttctttgacagatacattcatctgctcagcgattgagttagccccgcggccgtac 
1 VKKSSITLFASLTDTFICSA I E L A P R P Y 

55168 ataagacctaaaagaacggacttgacagaatttcttcgaagttttccttccttgttagtcgttccgtcgggatag 55094 
29 IRPKRTDLTEFLRSFPSLLVVPSG* 
dplORF137 

37146 atgcttcgaacttgtttgttagcaccgtcaggaggacaaactagtcgaacccattcacctgcgtctttgataatatctagcgcg 
1 MLRTCLLAPSGGQTSRTHSPASLIISSA 
37062 acagcgcctacagaagaagcaacgtgtttcaacttcctaggcaagccttctgctagttcataccataatgcgtag 36988 
29 TAPTEEATCFNFLGKPSASSYHNA* 
dplORF138 

30662 atgactatatcgaagaacaatgtagtcatccggcctatctgtatcttgctcgtcaaattcaactcctggaagcataggagcagg 
1 MTISKNNVVIRPICILLVKFNSWKHRSR 
30578 cgagagctgaaatgtaggaagaatttcctccaatctgtccatcattgtcgttcgtttagtcatgttcactcctag 30504 
29 RELKCRKNFLQSVHHCRSFSHVHS* 
dplORF139

12092 atgatactaaatcactcaacttgtttgaccctcctgataaattcgttcacgcagacacgcgcatttgagccctttttagatacc 
1 MILNHSTCLTLLINSFTQTRAFEPFLDT 
12008 tttcgcaaacacctagatgcttccctcactaaaaggtcatgggcctcaagttcttcgaaagacatttctacatag 11934 
29 FRKHLDASLTKRSWASSSSKDIST* 
dplORF140 

20562 atgttttcgatatttcctgcgcctaagacttcagcttggtcattgttcactaccattaggtattcattagtaagtgctttagca 
1 MFSI FPAPKTSAWSLFTTIRYSLVSALA

20646 aagtttgaaaatttcattttattttccctttatttgtttttctttatactattattatacaataatgattga 20717 
29 KFENFI LFSLYLFFFILLLYNND* 

dplORF141 

42922 gtgctaagagttgtagagatatcctctaaaacgctcttggctttattcgatttccattcgaataacttatttagtaggacagta 
1 VLRVVEISSKTLLALFDFHSNNLFSRTV 
42838 agcactccgctgcacgctgtaataatcgtcgtcaagactgctgtgtcgtttagccacattggcatagattga 42767 
29 STPLHAVIIVVKTAVSFSHIGID* 
dplORF142

31898 gtgactgtcgaagtttctccaaacagttctgtcactttacctaaaagcgtattagggattttcccgttagcgattaggttcatg
1 VTVEVSPNSSVTLPKSVLGIFPLAIRFM
31814 acacctgctgctcgaattttaacatggataggttcactaccttttgaaaatcctggaagtgcgatgatttga 31743
29 TPAARILTWIGSLPFENPGSAMI*
dplORF143 

7565 atgaagtttgggttgacgcttttaactccagaccgtttaattttttcaaggcttgaaattggataccatataaccttttcatgc 
1 MKFGLTLLTPDRLIFSRL E I G Y H I IFSC 

7481 ttttggaaatacactaaaattccggcgagaataaatttgcatccatctgcgcgtgatagctggaaccatcga 7410 

29 FWKYTKI P AR INLHPSARDSWNH* 

dplORF144 

36517 gtgcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatggaaattcaacaattacca 
1 VQIKRLTYLDTLNEAHSSRFLMEIQQLP
36601 ttgaataccgagccgatgacgcagcagcttggacctctactcctcccgctcaagttgaactgtttctaa 36669 
29 LNTEPMTQQLGPLLFPLKLNCF* 
dplORF145 

42067 atggaaacagctggagacctaacaagtggaaagaggttctatttaagcaagacttcgaacagaataattggcagaaacttgttc 
1 METAGDLTSGKRFYLSKTSNRI IGRNLF 

42151 ttcaaagtgggtggaaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatag 42219 

29 FKVGGTITQPMATHSIRKLLTA* 
dplORF146 

51484 atgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaacagtattcttcaaccgtcgaagtgttcgtt 

1 MTNCMIAS PFQYGTSRAKQY S STVEVFV 

51568 ctaagtttcaccagcacggtgaagatgaccctaaaacggaatttctttatggccaatatgagcttgtag 51636 

29 LSFTSTVKMTLKRNFFMANM S L * 
dplORF147 

55207 atgtatctgtcaaagaagcgaataaggtcattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttc 

1 MYLSKKRIRLLKISSPSSLKWQTISYSF 

55291 aacagcaggcgcaggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatga 55359 

29 NSRRRTWDMFKQLPVEEEGFLI * 
dplORF148 

28636 gtgtttcggttcaagaccattcgagtagggcgaacacctgtacgattttcgatgtcatccattgctgctaaaatgtcagcgata 

1 VFRPKTIRVGRTPVRFSMSS I A A K M S A I

28552 gggtcactttcagctgggttagtccatttcttagtgactgcatattgttgcttagcatccatgttgtag 28484

29 GSLSAGLVHFLVTAYCCLASML* 
dplORF149 

26474 atgccactgaacttttcgagcacaaggactaaccttgccccattgtctcactccagctgtggcggaatggctaatggtagttcg 

1 MPLNFSSIRINLAPLSHSSCGGMANGSS 

26390 agcaagtcgaagggcattgtattcgagattttgatatttatgagcagcaggtttccctag 26331
29 SKSKGIVFEILIFMSSRFP*
dplORF150

15185 gtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttgatagtcttcgcgaagttcgacgattcgtttgttcat 

1 VVLYSKKEVYSTSCTLIVFAKFDDSFVH 

15101 ttgctttcgctgattgttcatgcaataggcccctcgtatttaatagtttcacaagttgcgtcgacgtag 15033 

29 LLSLIVHAIGSSYLIVSQVAST* 
dplORF151 

28027 atgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttggaccaactcttc 

1 MIISTQGRLLATFKHFLQTLFNTLDQLF 

28111 tccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaa 28176 

29 SLMLNKQGQTFHGSRVQIICQ* 
dplORF152 

42235 atgtgcataaaggacttatcgacaaagaggctactattgcagtacttcctgaaggatttagaccgaaagtttcaatgtatcttc 

1 MCI KDLSTKRLLLQYFLKDLDRKFQCIF 

42319 aggctctcaataactcatatggaaatgccattctatgtatatacactgacggaagacttgtggtga 42384 

29 RLS ITHMEMPFYVYTLTEDLW * 
dplORF153 

22307 atggtggacaaagggctcacctttccgaactttcgatatcgtcatagcagacggttccattcgttcaggaaaaacagtatcgat 

1 MVDKGLTFSNFRYRHSRRFHS FRKNSID 

22391 ggctctttcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaa 22456 

29 GSFI FPLGHDGIQRTK LCHLW*
dplORF154

18446 gtgacaataggctttaagaactgcaaaaaaacctggggcgtctgcacgcgcaacctggagctccttaacagtcatccaaggctg 

1 VTIGFKNCKKTWGVCTRNLELLNSHPRL 

18530 aggtttcttacaaacaatcctaattccttcaaaatagctcttgtccgggtcaatagtgcctaa 18592 

29 RFLTNNPNSFKIALVRVNSA* 
dplORF155 

13512 atgaatacgaccctgagcaacttacaatgggacatggtgcaaaatctaatttccttcttcaacgtttcattcaactcacgccag 

1 MNTTLSNLQWDMVQNLISF FNVSFNSRQ 

13596 ttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttga 13658 

29 LKLKQFSGIWEPMILVLMQI* 
dplORF156 

18777 atgctagtatctccatttctgttggccttgctttttagctctgttcagttcagctgcttctcgcgatgcaatagtttcgagaat 

1 MLVS PFLLVLLFSSVQFSCFSRCNSFEN 

18861 atgcctgttcataggctcacaatattccgccaaagatttgccagttatggtggcgtcaattaa 18923 

29 MPVHRLTI FRQRFASYGGVN * 
dplORF157

13281 gtgcttgctggacttgagaagaaattggtatcattttcgagccaatccataaggttctcgataccgtcacgattgattgtttct 

1 VLAGLEKKLVSFSSQSIRFSIPSRLIVS 

13197 gttactgctttcttgaagcgttttttaaagtctgtcatattagacccctttcattttctataa 13135 

29 VTAFLKRFLKSVILDPFHFL* 
dplORF158 

40727 gtgaacgccgttattagggtcaaacgaagcccaaacggacattgtctttgtcccgtcactattgtgaggaacagtcacttctcc 

1 VNAVIRVKRSPNG HCLCPVTIVRNSHFS 

40643 acttgcgagcgttacctcttcgccggacgtgtcgtagtctgggtgactgctatgaacacttga 40581 

29 TCERYLFAGRVVVWVTAMNT* 
dplORF159 

30371 atgatttggtctgcgcttacccaagcagcttctcctttgagtttctgtcgagcattccctgtacggtctgtccaaatagcatgc 

1 MIWSALTQAASPLSFCRAFPVRSVQIAC 

30287 gtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttga 30225 

29 VFAYSS I LVAATSQTVMTAT * 
dplORF160 

41324 atgggttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttatagaatactatggaccgtctatcaa 

1 MGYRHARKTIERPRRIYQCYRILWTVYQ

41408 tttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaa 41467 

29 FLRSTYSSKSCNYPSSSKC* 
dplORF161 

52175 atgcaaaaaggtttaaatgcttatctcgacatgacattgaaagcattgcattcgagactatttcaaaatgtttggcaacgttca 

1 MQKG LNAYLDMTLKALHSRL FQNVWQRS 

52259 aatcaaaccaaggggccaagttttcaacttaccttacaagactcttcaagaatagaatag 52318 

29 NQTKGPSFQLTLQDSSRIE* 
dplORF162 

13020 atgacagaagttgcggtaaatagcccgcaaaaggtgagagtagttatggtcgggaatattgaatttctcgaatatttaaaaagg 

1 MTEVAVNS PQKVRVVMVGNI EFLEYLKR 

13104 aagtacggaacagaaacttccatcagttatattatagaaaatgaaaggggtctaatatga 13163 

29 KYGTETSI SYIIENERGLI * 
dplORF163 

40224 gtgaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatcttta 

1 VTEFLCSPQGMKLCTLRKGS F T S I T G S L

40308 cccaatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatga 40367

29 PNPFKSADLERNNTRLIQT* 
dplORF164 

6696 atgtactcttggagaacctcgtgcctaaatgttccagcttcgcccattgcaattaggttagaatctgcgttatctataatagac 

1 MYSWRTSCLNVPASPIAIRLESALSIID

6612 tcaccgattctttcgaaatacattcttcgaatacatccaccaaccccgctgggcttataa 6553 

29 SPI LSKYI FRIHPPTPLGL*
dplORF165

50504 atgagtgaaagctggtcaatccccaccacagatggcctatatttagatatcatgctatctaaaattgcaggggtaaggttcttt 

1 MSESWSIPTTDGLYLDIMLSKIAGVRPF 

50420 cctccaatcataaagggcgtgactaccacaagggaattttcagcctcagtcattgcttga 50361 

29 PPIIKGVTTTREFSASVIA* 
dplORF166 

23519 gtggtcatgctctttaatgactctatcttctcccgcttggctcgctttactgtcccagctgtaagcatagtattcatcaatgtc 

1 VVMLFNDSI FSRLARFTV PAVS IVFINV 

23435 gtgcgtgttgctagggtcgagtgtaaatctattctcagccaagagttcagcgtgaaatga 23376 

29 VRVARVECKSILSQEFSVK* 
dplORF167 

1008 atgcttattcggttggagcttcttacgtcgtatatggtgctcacgcagacgatgcggctggaggtgcttaccctgattgcactc 

1 MLIRLELLTSYMVLTQTMRLEVLTLIAL 

1092 ctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaa 1148 

29 LSSIIQCQMQWNMELEAR* 
dplORF168 

54345 atgagactttttccaggttatattcttcacattgttcagttcctggagtcaagtattgttcttgaaattcatagagttcgaaag 

1 MRLFPGYILHIVQFLESSIVLEIHRVRK 

54261 tttgcaaagggtcataggccgcatacatataggcaacatcaggaggaattaaactaa 54205 

29 FAKGHRPHTYRQHQEELN* 
dplORF169

45954 atgaacacagcatcgcgaagagtttcaatgttagtgataaggaagaattcgtcgtggccaccaagcaagtcttctgcccgttta 

1 MNTAS RRVSMLVI R KNS SWPP S KSSARL 

45870 gaaactccgtcaatcactaatttcccatctttagtgactcgacttcctaaaatatga 45814

29 ETPSITNFPSLVTRLPKI* 
dplORF170 

27600 atgatgattgttcttgtgctcctgccgtttgttgagcagcagcaagttgcttaccaaaagagccgatttcacgaggttcgggaa 

1 MMIVLVLLPFVEQQQVAYQKSRFHEVRE 

27516 caccaccaccgacacgacctggatttcctaaatttccagtcccggctggcgacttag 27460 

29 HHHRHDLDFLNFQSRLAT* 
dplORF171 

47678 atgtcattttctttcatgtactcttttagagcatcacgaagacttttgacttgtttctccatgtcgcctttggtagcatttaat 

1 MSFSFMYSFRASRRLLTCFSMSPLVAFN

47594 tcaccggcttcttcaattgcagcgatgaactgtttttcatcttcaaatttcatttaa 47538 

29 SPASSIAAMNCFSSSNFI* 



dplORF172 

10462 atgtttcgaacattttctaccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcg 

1 MFRTFSTPLLEAASISIGEPSPLFTSFA

10378 aaaattcgagcagtagtggttttaccagttccagcgccaccacagaatagatag 10325 

29 KIRAVVVLPVPAPPQNR* 
dplORF173 

32160 atgacattagacatttccttcgtctgtacgaaaggtttcagcttgagtcacttcaccgtacattgcactgaagattgtcataag 

1 MTLDISFVCTKGFSLSHFTVHCTEDCHK 

32076 ttgctcatctgtcatatactcgccgacttcagcgcaagtaggctctaccattga 32023 

29 LLI CHILADFSVSRLYH* 
dplORF174 

29766 atgtcccatcagcccttttcattaagattgtcgaaccagcgttcgacttttcatcagtttcaagctgttcttgcttatattggt 

1 MSHQPFSLRLSNQRSTFHQFQAVLAYIG 

29682 cataatagaattgcgccatttgtttccagtagtctgcgtcaccttttagactga 29629 

29 HNRIAPFVSSSLRHLLD* 
dplORF175 

15648 atgcgcgtgatgtcatggcagataggcgaggataaagagtgtcgaatagaacgccgcagagcttacgagagcgccaaatacaag

1 MRVMSWQIGEDKECRIERRRAYESAKYK

15564 ggcgacggtactacggtggtcctcttgcttacctgtaaccaaataaaccattga 15511 

29 GDGTTVVLLLTCNQINH*
dplORF176 

43031 gtgataaagacggtaacgttgaattttcctagttccgtcttgaatgacgtcattttggtgattgattgctactgtcgtttggtc 

1 VIKTVTLNFSSSVLNDVILVIDCYCRLV 

42947 aatcccgtcgacctgctgtttaagagtgctaagagctgtagagatatcctctaa 42894 

29 NPVDLLFKSAKSCRDIL* 
dplORF177 

19937 atgaacctaaacagttcgagacttctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcata 

1 MNLNSSRLLKLLGKKQVEYFGGNVNLVI 

19853 ttctcgcgactaattttaggtgcttttgtattaatcagcgtgatatgcgcttga 19800 

29 FSRLILGAFVLISVICA* 
dplORF178 

11924 atgacaactgtcgaccaatttaaaagacagttgaggaaaagtttaggctcaatttttccttcatcagtttccttaaatttgagc 

1 MTTVDQFKRQLRKSLGSIFPSSVSLNLS

11840 caattagtaacctttagcgaattgctagcacttgccccccatattaagtcataa 11787

29 QLVTFSELLALASHIKS* 

dplORF179 

56058 atgggtagggttattccttacctcgtcgatttgctttatgcaaaacctaccacaatcgcttgtcgtggcttcaggagttgcatt 

1 MGRVIPYLVDLLYAKPTTIACRGFRSCI 

56142 ttggataagtcaaaaagcaagtgtctttatattcgacaagctctcgaataa 56192 
29 LDKSKSKCLYIRQALE*
dplORF180

41176 atgttcgacatgatttggaggaagttgttccctgttaaaatctgtcgaaccgctgaagtcgtgtctactaaagaaatgcccgaa 
1 MFDMIWRKLFPVKICRTAEVVSTKEMPE 
41092 aaagtaggacgtactgaatcggggatgttgaacctccatccgtttgaatag 41042 
29 KVGRTESGMLNLHPFE* 
dplORF181

13126 atggaagtttctgttccgtacttcctttttaaatattcgagaaattcaatattcccgaccataactactctcaccttttgcggg 
1 MEVSVPYFLFKYSRNSIFPTITTLTFCG

13042 ctatttaccgcaacttctgtcataggctgtcctcctttgcttatactgtaa 12992 
29 LFTATSVIGCPPLLIL*
dplORF182 

45369 gtgcttgcccatgtttcaataaatagggttcgacctcgcctagctttcgaacgtgctataacgatttcaatcatagcgaagaaa 
1 VLAHVSINRVRPRLAFERAITISIIAKK 
45285 ggtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttga 45235 
29 GEKLQSIPLRCQYLLP* 
dplORF183

13896 gtgattccagcttttggtttttcttcagcctcttcaactttttcttccttaggcgcaggtttcttacgagttgaactcttaggt 
1 VIPAFGFSSASSTFSSLGAGFLRVELLG
13812 ttttcttcaactacttcttcaacctcagcctcttgttcaactggaccttga 13762 
29 FSSTTSSTSASCSTGP* 
dplORF184

53330 gtgaacttgccgtcaaccacgtcaaacatttggtcttcgtcgaggtctaaaattagagttccaagaagttcgctcttttctgga 
1 VNLPSTTSNIWSSSRSKIRVPRSSLFSG

53246 aaatcttcaagagtagcactgtcttccggacgctctggaaggaattcataa 53196 
29 KSSRVALSSGRSGRNS* 

dplORF185 

22522 atgaaattcgagatgttcgaaatgaaaatctacttattattagacactttagaaatggcgaagaaattgtcaactacttctata 

1 MKFEMFEMKIYLLLDTLEMAKKLSTTSI 

22606 tatttggaggaaaagatgagtcgagtcaagaccttatacagggggtaa 22653 

29 YLEEKMSRVKTLYRG* 

dplORF186

21272 atgctcgaaaaactcaaccggttcgaaaacctcaatccttcgaaaagtcgaaccattcgaaaagttcaaaagttcgaaaaactc

1 MLEKLNRFENLNPSKSRTIRKVQKFEKL 

21356 aaccattcgagagtaggaattaaggacataccagttcaacctttttag 21403 

29 NHSRVGIKDIPVQPF* 
dplORF187 

34415 atggtcttgttcaatctcttcctactatcattcaagcagctgttcaaattatcactgctttattcaatggtcttgttcaggcac

1 MVLFNLFLLSFKQLFKLSLLYSMVLFRH

34499 ttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataa 34546 

29 FLRLFKQVFKFCQLS* 
dplORF188 

35609 atgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagtgacaaccctaaccaacctcagtcacaatcta 

1 MFVKQPVRLEWTCSIQEVTTLTNLSHNL 

35693 aaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtag 35740 

29 KTIKASKPLSTLEQS*
dplORF189 

42587 atgcaaacgcagtatcaaccgtctctgaaactcttcatgacccagacttgtatgctgcgaaccgtcgagaacttcgagctgacg 

1 MQTQYQPSLKLFMTQTCMLRTVENFELT 

42671 agcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattctag 42718 

29 SKNFAKLVTQSKMKF*
dplORF190 

39786 atgtattcactcaaagttgttcagtgtggctcaatcatattaaaatcgaacttggtaatatctctactccttttagtgaagcag 

1 MYSLKVVQCGSIILKSNLVISLLLLVKQ

39870 aggaagaccttaaatatcgaattgactcaaaagccgatcaaaagctaa 39917 

29 RKTLNIELTQKPIKS* 
dplORF191 

40996 atgtccattgttccggaacttgatttaggtaagtaccttgctaagtccagtgacggcgtaaaggatacgctagtagtatggttc 

1 MSIVPELDLGKYLAKSSDGVKDTLVVWF

40912 ttacctaaatctatccagtcgctaccgaaaactcggtaccaaacttga 40865 

29 LPKSIQSLPKTRYQT* 
dplORF192 

2920 atggtcgacgtcgaatgttttctcgagatgaagtttagggtcttctcgataccctacggtatgttcagcgagtgctttaacaaa 

1 MVDVECFFEMKFRVFSIPYGMFSECFNK

2836 acggaatggagtatcttgcaacccgtcacgttctgcgtcctcgcctaa 2789 

29 TEWSILQPVTFCVLA*

dplORF193

42456 atgatttcagctcaaattaaatacgaaatgagacattgtctaaatttaaccaagaattatctacattcgatttcaccacaagtc
1 MISAQIKYEMRHCLNLTKNYLHSLSPQV
42372 ttccgtcagtgtatatacatagaatggcatttccatatgagttactga 42325 
29 FRQCIYIEWHPHMSY* 
dplORF194 

40284 atgaacccttgcgtaaggtacataacttcattccctgcggagaacatagaaattcggtcacttgataccttaatggtagagcta 
1 MNPCVRYITSFPAENIEIRSLDTLMVEL 
40200 ccgtcgttcttaccgacaattagaccttcattagaagagctcatgtaa 40153
29 PSFLPIIRPSLEELM* 
dplORF195 



ZtsT atgttcacaac^tgttttgacaagtttcttttcagcc^^^ 

i M PTI VVLTSFFSAPCPIVNSATIWRDFV 



42500 aggttcaacatagttctcacctcctttctaaaaaatattataacatga 42453 
29 RFNIVLTSFLKNI IT* 

??27f 196 atggtagatttaacaagtccctgtccaatcatgtcactcrt 

1 MVDLTSPCPIMSLLLAHQKKFGFNYRFS 

11189 attagqctcccattcaacaactccagcaagttcattcatttcttctag 11142 

29 I R L PFNNSSKFIHFF* 
dplORF197 tttca ^ 

7484 acgaa ^ yq jqfqalKKLNGLELKASTQTSS 

7568 atgcagggtatgaagtttcttacaagaagcgtcgaactagattga 7612 

29 MQGMKFLTRSVELD* 

dplORF198

24119 atgccgctcaacaaattgacgtccagttttattcaatgcctcagttcacctatacagttgaccctagaaacccttccagcttgc

1 MPLNKLTSSFIQCLSSPIQLTLETLPAC

24203 tttctgttgacattgtttatcaggacgagcgtacaaaaggaatga 24247

29 FLLTLFIRTSVQKE*

SET"' gtggctcctgaattaggctgtacttttcctcccaactgcttagcaactgccttctcttgtttagcactagctctgcgcgtg 

1 V APELGCTFPPNCLATAFSCLALALRVG 

15658 atcggtttgcatgcgcgtgatgtcatggcagataggcgaggataa 15614 

29 IGLYARDVMADRRG* 

*8?f 2 °° atgacaggcttgtattcgataagccctgaaag^ 

X mtGLYSISPESFSHISSVSASSTNFSII 
47759 cctttcaagcgttcttcgtccatagttgagcgctctgtcgtgtag 47715 
29 SFKRSSSIVERSVV* 

?sllT 0t atgggcttcacaagttccttctttaatcaaaggtc^ 

1 MGFTSS FFMQRS 1 SLDSNYLDLYRFN YR 

38653 aacgggctatcaaaaaacctacattccaaaagacgggaatga 38694

29 NGLSKNLHSKRRE* 

dplORF202 aaattttttacaaaatgcttgaca acatt 

44483 f^l 9Z l ppjjclFYKMLDNlHSLSYNTIIKI 

44567 aataaagccgaaaggcgaggaggacattatgtcaaaaattaa 44608 

29 NKAERRGGHYVKN* 

S^'" f gattaggatt^ 

22697 ttcaggcatcagtgccacctcatcacagaagatacctgctaa 22656 
29 FRHQCHLITEDTC* 

dplORF204 atgaccacggtt ^ agtcaagggatggttgttgacttt tatcacgtc 

1 MTTVRVKGWLI*TFITSRKSQvHSI*T»i* 
1555 acgctgttcttcttcaagggaatgaaccaatcgctttag 1593 

29 TLFFFKGMNQSL* 

tllT* 205 gtgacactgatgaatggttctcagtttggtatgctactcgtgacgcagatatcttctacgaccaaagaattgcccaattta^ 
1 V TL M NG SQFGMLLVTQI SSTTKELPNLE 

8608 ttcaggaaaagcaacctgctatcaagttcaatttcgtag 8646 

29 FRKSNLLSSSIS* 

dplORF206 ^ ttca ^ ttcccaccaaa atattcgacctgcttctttcccaacagcttgagaagtctcgaactgtttagg« 
1 MTKFTFPPKYSTCFFPNSIiRSLELFRF 
19939 aaattgttcaacttgagcaagtgcgatattattctttag 19977 
29 KLFNLSKCDI I t» * 

2?5of 2 ° 7 gtgtcggtggtggtgttcccgaacctcgtgaaatcggc^ 

1 VS vVVFPNLVKSALLVSNIiLLLNKRQfc.H 
27586 aagaacaatcatcattctctaaacaataggaggaactaa 27624 
29 KNNHHSLNN RRN* 



^27f 2 ° 8 atgtttggtatgaagcaaaagacttcgctgaagaaaataacattcacttc^ 

1 M FG MKQKTS LKKI T FTS R L F F LNLEQTL 

47363 accatcgtggttctcgattctgggatgacgaaggcgtga 47401 

29 TIVVLDSGMTKA*

dplORF209

29784 atgttaagaatcaagttcgtagagccattgaaaccgctcccactaaaatcaaggtacttcgaaactcttgggtcagtgatggat
1 MLRIKFVEPLKPLPLKSRYFETLGSVMD

29868 atggaggaaagaaaaaggataaagcgaatgaagtcgtag 29906 
29 MEERKRIKRMKS*

dplORF210 

53077 atgtttcaacttttcccgtatcatggttgtaaagttgaagaaacagtttttcaatacgagggaatccgttttggcataatggac 
1 MFQLFPYHGCKVEEIVFQYEGIRFGIMD 
52993 aattatcaggatggactgtttccccgtcttcgccaatag 52955 
29 NYQDGLFPRLRQ* 



dplORF211 

20959 gtgctcgacttttatgtcgcccctaatttttgtttttacttacggactatgggatttgtaggtattttcagggcgcttttttat 
1 VLDFYVAPNFCFYLRTMGFVG IFRALFY 

20875 ttacttattaagtccttttctatattagattgtttataa 20837 
29 LLIKSFSILDCL* 
dplORF212 

52983 atggactgtttccccgtcttcgccaatagcattgcaattgatatagcgtcgacgaccgtcaacgtctgcttcgtggactacgaa 
1 MDCFPVFANSIAIDIASTTVNVCFVDYE 
52899 ataatccatgtcttcgccttccgggtcatcatacaatag 52861 
29 IIHVFAFRVIIQ* 
dplORF213 

30291 atgcgtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttgaaacttgtttcgataccg 
1 MRLCVFFHLSSSDFADCYDSDLKLVSIP 
30207 ttcacagttactaacaaattcttcaggcttccatactaa 30169 
29 FTVTNKFFRLPY* 
dplORF214 

24273 atgatgccaaagttgtttttcagtgctcattccttttgtacgctcgtcctgataaacaatgtcaacagaaagcaagctggaagg 
1 MMPRLFFSAHSFCTLVLINNVNRKQAGR 
24189 gtttctagggtcaactgtataggtgaactgaggcattga 24151 
29 VSRVNCIGELRH* 
dplORF215 

35822 atgttaccaaaccctgatagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaact 
1 MLPNPDRVSLLLLYNPLDSLSTSSLFRT 
35738 acgattgttccaatgttgacaacggtttgctcgccttga 35700 
29 TIVPMLTTVCSP* 
dplORF216 

32849 atggcctcggagctcgcggccacatctcctccagatacggcagccaggtcaagtacccctggcatagcgtccatgatttcattt 
1 MASELAATSPPDTAARSSTPG IASMISF 

32765 acctggaaaccggctgaagctagattttccataccetga 32727 
29 TWKPAEARFSIP* 
dplORF217 

23443 atgaatactatgcttacagctgggacagtaaagcgagccaaacgggagaagatagagtcattaaagagcatgaccactgcatgg 
1 MNTMLTAGTVKRAKREKIESLKSMTTAW

23527 ataggaacagatatgcctgtctcactgacgctctaa 23562 
29 IGTDMPVSLTL* 
dplORF218

22029 atggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctccgggccaaaatgggcgacc 
1 MECFRKRFDIDYKLSARKLHCSGPKWAT 
22113 aggaaattgaaggcgaggttaaagataacttcgtag 22148
29 RKLKARLKITS* 
dplORF219 

51388 atgattttatgctcgactttttcagttctcccatttcttcgaaacgcttcagggctgacgccttgcctaactacttcgctagac 
1 MILCSTFSVLPFLRNASGLTPCLTTSLD

51304 gttccaaaattccttttcagccactggtttccatag 51269 
29 VPKFLFSHWFP* 
dplORF220 

6334 gtgaagttttcttcggtgacggttgatacaatttccttcaagagtaagctgttaaggtggcaagtgaattctttcttcgaaact 
1 VKFSSVTVDTISFKSKLLRWQVNSFFET 
6250 ttcttgccagcagatgcgtacatgatgtcttcataa 6215 

29 FLPADAYMMSS*

dplORF221 

43507 atgactgctcaagttctatgtaccatgctctccgctcagccggagcttcaagtgccggatgggcagtcaatactgagtacatgc 
1 MTAQVLCTMLSAQPELQVLDGQSILSTC 
43591 acgcatggcttattgaaaacggttatgaactaa 43623 
29 THGLLKTVMN* 
dplORF222 

13212 gtgacggtatcgagaaccttatggattggctcgaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaa 
1 VTVSRTLWIGSKMIPISSQVQQALDTME 
13296 gctatgaaggtggacttgtcgagcactcattaa 13328 

29 AMKVDLSSTH*

dplORF223

14055 atgtggtggtacctgctggatatgttcgagatgtctactacttctacagtgaagtcgctgacgtttactacaagaaagatgtcg 

1 MWWYLLDMFEMSTTSTVKSLTFTTRKMS 

14139 acgagcctgacgatgacagcgacattcttgtag 14171 

29 TSLTMTATFL* 
dplORF224

13621 atgccagaaaattgcttgagcttcaactggcgtgagttgaatgaaacgttgaagaaggaaattagattttgcaccatgtcccat 

1 MPENCLSFNWRELNETLKKEIRFCTMSH

13537 tgtaagttgctcagggtcgtattcatatgctaa 13505

29 CKLLRVVFIC* 
dplORF225 

32991 gtgagcaacgggtgcgacgtatttcatcgcctctgccatgtcgctagtttctgcgttcgtatcagctgctgctcgagcaaatac 

1 VSNGCDVFHRLCHVASFCVRISCCSSKY 

32907 gtcagccacgtgacccgcctggtttgcctctaa 32875 

29 VSHVTRLVCL* 
dplORF226 

25191 gtggctgcgtacattagtttgaacttcagtgagcgcaagttgcttagcagaaagttcatcgctaggaattggatagtggtgttc 

1 VAAYISLNFSERKLLSRKFIARNWIVVF

25107 gatagtcattgtcgtaagtgtttgataacttga 25075 

29 DSHCRKCLIT* 
dplORF227 

23115 atgactcaattagatggtagcgcttatgacgtttcgagaatccataaaggccgaaggttgttgcattatagataccaaagtcgc 

1 MTQLDGSAYDVSRI HKGRRLLHYRYQSR 

23031 ctgctacgaataaacggtcgaattctatattga 22999 

29 LLRINGRILY*
dplORF228 

10450 atgttcgaaacattattgaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgc 

1 MFETLLKILDTSLWTASSKFTSLTRFIC

10534 tttcaaccggagcatttaatgcgctgttga 10563 

29 FQPEHLMRC* 
dplORF229 

27634 atgtgcgagttaagaaaactgattttaatcaaaceactcgaagcattgtcgcaattcctgaccactacgttgctttggctgctc 

1 MCELRKLILIKPLEALSQFLTTTLLWLL

27718 aaattccagctaccgcagcaactcaagtag 27747 

29 KFQLPQQLK* 
dplORF230 

50723 gtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatatatgtggg 

1 VTKNPAYLNYLSLKTDMAKTEKSSNICG 

50807 acgttgaaactggaacctatactcttatag 50836 

29 TLKLEPILL* 
dplORF231 

31071 atgcgcgtgtcattgcgtttcacatcttcagttccctccgaggtcacggcttcgagttctgctgtttctgccgtatctacgaca 

1 MRVSLRFTSSVPSEVTASSSAVSAVSTT 

30987 aagttagctccgccgacttttggcaactga 30958 

29 KLAPPTFGN* 
dplORF232

29385 atgtcaattccattagctcttgctaattcaacgagctcaggaacggttttagccgcatactcttcgcgcatttgttcaacttcg 

1 MSIPLALANSTSSGTVLAAYSSRICSTS

29301 tcaatttcttcaactgattcaattgtttga 29272 

29 SISSTDSIV* 
dplORF233 

52892 atgtcttcgccttccgggtcatcatacaatagagtgacaattgcgctgtcaccgtggtcagcgagtgtgaaaaactcgttatta 

1 MSSPSGSSYNRVTIALSPWSASVKNSLL 

52808 gaccctgagctaaatgttcctgatttttga 52779

29 DPELNVPDF* 
dplORF234 

36253 atgcttacgagtacagcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccag 

1 MLTSTATQLFERFISFNPLWEAIAYLTQ 

36337 gaagacctactcgacaatttagagtag 36363 

29 EDLLDNLE* 
dplORF235 

32768 atgaaatcatggacgctatgccaggggtacttgacctggctgccgtatctggaggagatgtggccgcgagctccgaggccatgg 

1 MKSWTLCQGYLTWLPYLEEMWPRAPRPW 

32852 ctagttcacttcgagcctttggattag 32878 

29 LVHFEPLD* 
dplORF236

37528 atgttcgtcgcttttagatttagcaatatatcgaggcttcatgtggcgtgtagtaaaccacgaaacatcaatgagatattcact

1 MFVAFRFSNISRLHVACSKPRNINEIFT 

37444 tccattgttgatagaagcaaacgttaa 37418 

29 SIVDRSKR* 



dplORF237

1678 gtgagagtccaggtaaggaatcttgacatattctcagccgtagttctaaatccaaatagaactcgcttggtgtcaacnrgcattt 
1 VRVQVRNLDIFSAVVLNPNRTRLVSTAF
1594 gctaaagcgattggttcattcccttga 1568 

29 AKAIGSFP* 
dplORF238

1301 atgcctttttgcggtcgatacaagttgcgcaagttccacaactttcagcgtcactttcataacatgaacgagtcaagaaataag

1 MPFCGRYKLRKFHNFQRHFHNMNESRNK

1217 gaacatctaaatcaattccccatttaa 1191

29 EHLNQFPI* 
dplORF239

26521 atggtgaagtatttcctatcgaagaatgtcctttcgaccaccctaatggaatgtgctaccaaactgtatggtacgaaaactcac

1 MVKYFLSKNVLSTILMECATKLYGTKTH

26605 tcgaagaaatcgctgatgagttga 26628 

29 SKKSLMS* 
dplORF240 

41893 atgtttggaataagcgtgaaacagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaactcgaggtgaatggg 

1 MFGISVKQSLHGEVTNTRTTLRELEVNG

41977 gactatttcaaaatttctggttag 42000 

29 DYFKISG* 
dplORF241 

47020 gtgtctttccttaatatggagatagttttcattctatttaagcaggatatcgaaaaggttaccaattttagatttcataggctt 

1 VSFLNMEIVFILFKQDIEKVTNFRFHRL

46936 accatctacgatataatctgctaa 46913 

29 TIYDIIC* 
dplORF242 

41338 gtgtctgtaacccatgctcttacggtagcggagccattaaagttcatcatacccaatttgccgccgttttcgttgatagcttgg 

1 VSVTHALTVAEPLKFIIPNLPPFSLIAW

41254 tttttacctacgagctcagcgtga 41231 

29 FLPTSSA* 
dplORF243

51306 atgttccaaaattccttttcagccactggtttccatagaaccctccatcgtttcgacctaatacattcgagacgaattcagtta 

1 MFQNSFSATGFHRTLHR FDLIHSRRIQL 

51222 gtcctgaagtgtagccgcaagtga 51199 

29 VLKCSRK* 



dplORF244 

27083 gtgaggtacaaaatgttgaccgtcgccgtcaatgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcac 

1 VRYKMLTVAVNEKFSIEFFRSFRNNFLH 

26999 ctgtttgatagttggttcatctag 26976 

29 LFDSWFI * 
dplORF245

6278 gtggcaagtgaattctttcttcgaaactttcttgccagcagatgcgtacatgatgtcttcataactgctagtagaagttttaat 

1 VASEFFLRNFLASRCVHDVFITASRSFN

6194 tcgaagtcggtctttcaagaataa 6171 

29 SKSVFQE* 
dplORF246 

2831 atggagtatcttgcaacccgtcacgttctgcgtcctcgcctaatagaccaaaaagtctttgaacggctgcctcagtattgtcca 

1 MEYLATRHVLRPRLIDQKVFERLPQYCP 

2747 aggttacaatttcatccggcttaa 2724 

29 RLQFHPA* 
dplORF247 

29641 gtgacgcagactactggaaacaaatggcgcaattctattatgaccaatataagcaagaacagcttgaaactgatgaaaagtcga 

1 VTQTTGNKWRNS IMTNISKNSLKLMKSR 

29725 acgctggttcgacaatcttaa 29745 

29 TLVRQS* 
dplORF248 

53560 gtgcaaagcctcgttctagcaagaagaacgatgctcagttacttgctcaacggaaaaacaggaagcctgcagttgaggttactt 

1 VQSLVLARRTMLSYLLNGKTGSLQLRLL 



53644 acatttcaggaaacgctctaa 53664 
29 TFQETL*



dplORF249 

2012 gtggatgcgactatcattgcaactggtgtgactcagcctttacctggaacggtactactgagccggaatatatcacaggcaaag 

1 VDATIIATGVTQPLPGTVLLSRNISQAK



2096 aagctgctagtcgaatcttga 2116 

29 KLLVES*



dplORF250

23837 atgggcaaacatggaagattgacgaagactcagtcgactacaaacctactcgagaaattcgaaactatattcgacaacttatca 

1 MGKHGRLTKTQSTINLLEKFET I FDNLS 

23921 aaaagcaatcacgctttatga 23941 

29 KSNHAL*
dplORF251 

39205 atggaaataattagtcttaccgtctgcgcctggcttcccgggtatccctcgagctccgtcattccccttccatttcgtccatgt 

1 MEIISLTVCAWLPGYPLSSVIPLPFRPC 

39121 ataggctgcagggtcttttga 39101 

29 IGCRVF*

dplORF252

54771 gtgttgtataggtcgaaactaattttgcatattttctatatttcaaaagtgcttttgagatatcgttatcaaaatgctcgacaa 

1 VLYRSKLILHIFYISKVLLRYRYQNARQ

54687 tactttcgcctgttcctctag 54667 

29 YFRLFL* 

dplORF253 

56255 atggttgcgtctataatagaaccgatgttgctagacaaagcatttgcaatcttcgagtctaatttattcgagagcttgtcgaat 
1 MVASIIEPMLLDKAFAIFESNLFESLSN
56171 ataaagacacttgctttttga 56151 
29 IKTLAF* 
dplORF254 

48479 atgaacctttcgcttaggttcaatctttttcgaacattttcatatttaacaaaactttcagctaaaaatcgacaaagttcaatg 
1 MNLSLRFNLFRTFSYLTKLSAKNRQSSM 
48395 ttcgactcaatgtttaaataa 48375 
29 FDSMFK* 
dplORF255 

9572 atgctttggtcttctcgacgaatgactctactacattccctgcagggtttcgagcagtacgggtcaatgatgcaccgttttcgt 
1 MLWSSRRMTLLHSLQGPEQYGSMMHRFR 
9488 caaggtagtcaccttttctaa 9468 

29 QGSHLF* 
dplORF256

15289 atgaccttccagtcactaatgcggccgctgaaattggataccactatacatgggttcaccaacttcgagacaaagcagttgaaa 
1 MTFQSLMRPLKLDTTIHGFTNFETKQLK
15373 cacttgaagaaattttag 15390 
29 HLKKF*

dplORF257 

28216 gtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgagtctatgcgacccggtgaaaaagaccgtcaaa 
1 VNVLDLANKLLRWHSSVS LCD LVKKTVK 

28300 acttgcaaatgctattga 28317 
29 TCKCY*

dplORF258 

44023 atggaaattggtattggttcgaccgtgacggatacatggctacgtcatggaaacggattggcgagtcatggtactacttcaatc 
1 MEIGIGSTVTDTWLRHGNGLASHGTTSI
44107 gcgatggttcaatggtaa 44124 
29 AMVQW*

dplORF259 

4298 atgactcgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaaga 
1 MTRLRSIKTSGWKEYSKLFETVLIQTLR 
4382 ctcacgcatttgggatga 4399 

29 LTHLG*

dplORF260 

24746 gtgaccctacttcctcaatcggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccag 
1 VTLLPQSAVLEASKLKSLPFQETSTSFQ 
24830 cggctgaatattatttag 24847 
29 RLNII*

dplORF261 

288 atgaattcacttccctttgccctaaaacaggacagcctgacttcgcgaatgttttcattagttacattccaaacgaaaagatgg 
1 MNSLPFALKQDSLTSRMFSLVTFQTKRW

372 ttgaatctaaatcattga 389 

29 LNLNH*

dplORF262 

9408 atgcctattcaactccaggcggaaagatgtggaagcatgcttgtgcagttcgacttaaatttagaaaaggtgactaccttgacg 
1 MP I QLQAERCGSMLVQFDLNLEKVTTLT 

9492 aaaacggtgcatcattga 9509 

29 KTVHH*

dplORF263

27052 atgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcacctgtttgatagttggttcatctagacctttt 
1 MKILASSSFEVFEIISFTCLIVGSSRPF
26968 aacaagtcttctaattga 26951
29 NKSSN*

dplORF264 

6139 gtgaatagtacaaggcggtctaatacgctcaggatttctgctgtagggatagccgcatcatcttcaaactcaattgagtcaagc 
1 VNSTRRSNTLRISAVGIAASSSNSIESS 
6055 tgtgaaacgtcttcataa 6038 

29 CETSS*

dplORF265 

4801 gtgaataaagtcaagcgtttttgtataaaaagttcatttttttttaaaaaaaataagagcgaaaagctcttatctaaaatagtc
1 VNKVKRFCIKSSFFFKKNKSEKLLSKIV 
4717 gacgttgacgatttttaa 4700 

29 DVDDF*

dplORF266 

50220 atgcccgttcttccaagcagttgcaagcattttatcaatagtccacgacttaccttgtccaggtcgagccattatgacaatcaa 
1 MPVLPSSCKHFINSPRLTLSRSSHYDNQ

50136 atcctcaccaggaagtaa 50119

29 ILTRK*

dplORF267

47367 atggtcaaggtctgttctaggttcaggaagaacaaacgggaagtgaatgttattttcttcagcgaagtcttttgcttcatacca 

1 MVKVCSRFRKNKREVNVI FFS EVFCFI P 

47283 aacattaatcgtagatag 47266 

29 NINRR*

dplORF268 
12621 atgtcaatttcggtctcgtgcttgacaatggattcaactactgatgcgccaacctttttcaatcgcgacagcttgtccaattca
1 MSISVLCLTMDSTTDASTFFNRDSLSNS 
12537 ttgtcaattctagagtaa 12520 
29 LSILE*

dplORF269 

53834 gtgaatagtatcgagtccatcagtttctacgtcaatagaacctattccgtcttcaatcattttgtctacatactgctcgagttt 
1 VNSIESISFYVNRTYSVFNHFVYILLEF 
53750 tgcttcctcagtgattaa 53733 
29 CFLSD*

dplORF270 

50792 atgatttttcggtcttcgccatatcggtttttaacgacagatagttcaagtatgccggatttttcgtcacgcttcatagcgata 
1 MIFRSSPYRFLTTDSSSMPDFSSRFIAI 
50708 actctgctagcattttga 50691 
29 TLLAF*

dplORF271 

19739 atgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaattcatacctcaaag 
1 MRLLCFIFVTVLTDFLLANLPTRIHTSK 
19655 gctttttgtcagccttag 19638 
29 AFCQP*

dplORF272

1556 gtggtcaagtctgtcaatgaatgtacctgcgattttcctgacgtgataaaagtcaacaaccatcccttgactcgaaccgtggtc 
1 VVKSVNECTCDFLDVI KVNNHPLTRTVV 

1472 ataagttccgcctgctaa 1455 

29 ISSAC*

dplORF273

56256 atggatttcattaggactgagtcctcttggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttct 
1 MDFIRTESSWNWNGCIYRYSVSRTRPSS 
56340 agttcagtttatcctgcagtcaattgcttcgagatatttgaaaaagtagtcaggaaaattcctgattatcttgcagtcaattgc 
29 SSVYLAVNCFEI FEKVVRKI PDYLAVNC 

56424 ttcgagatatttgaaaaagtagtcaggaaaattcctgattattttttttacaaaaacgcttga 56486 
57 FEIFEKVVRKIPDYFFYKNA*



Patent provided by Sughaie Mion, PLLC - http://www.sughrue.com 



WO 00/32825 



PCT/IB99/02040 



Table 31 



Query= sid|114822|lan|dplORF001 Phage dpi ORF|36698-40390|2
(1230 letters)

>gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage BK5-T]
Length = 1904

Score = 427 bits (1086), Expect = e-118

Identities = 226/475 (47%), Positives = 281/475 (58%), Gaps = 45/475 (9%)



Query: 395 ABSGKYIGVXNTNKKPSELVPDDFTWI RLEGPKGDAGLPGAPGRDGVDGVPGKSGVGI AD 454
A+ YIG + P D+TW + +G+ G GA G+DGV GK GVGI
Sbjct: 820 ADYPSYIGQYTDFIQYDSAKPSDYTWSLI - - - RGNDGKDGATGKDGV AGKDGVGIKT 873

Query: 455 TAITYAVSVSGTQEPENGWSEQVPELIKGRFLWTKTFWRYTDGSHETGYSVAYIGQDGNS 514
T ITYA+S SGT +P GW+ QVP L+KG++LWTKT W YTD S ETGYSV YI +DGN+
Sbjct: 874 TVTTYALSSSGTDKPNTGWSQVPTLVKGQYLWTKTVW^ 933

Query: 515 GKDG I AGKDGVG I AATEVMYAS S PSATEAPAGGWSTGVPTVPGGQYLWTRTRWRYTDQTD 574
G DG I AGKDGVG I T + YA ST APA GW++QVP VP GQ+LWT+T W YTD T
Sbjct: 934 GNDGIAGKDGVGIKIOTITYAVGTSGTTAPASGWNSGVPNVPAGQF^ 993

Query: 575 E IG YS VSRMG EQG PKGDAGR - - -DGIAGKNGIGLKSTSVSYGISPTDSAIP-GVWASQVP 630
E GYSV+ MG +G KGD G +GIAGK+G G+K+T+++Y SP + PGW++VP
Sbjct: 994 ETGYS VAMMGVKGDKGDPGNNGTNG I AGKDGKG I KATA I TYQAS PNGTTAPTGTWSASVP 1053

Query: 631 S LI KGQ YLWTRT I WTYTDSTTETGYQKTYI PKDGNDGKNG I AGKDGVG I KSTTITYAGST 690
+ KG + LWTRT I WTYTD +TTETGY Y+ +GN+G +G GKDG GIK+TTITYAGST
Sbjct: 1054 P VAKG S FLWTRT I WTYTDNTTETG YAVA YMGTNGNNGHDG F PGKDGTG I KTTTITYAG S T 1113

Query: 691 SGTVAPTSNWT S A I PNVQPG P F LWTKTVWNYTDDTS ETGYS VS K I G ETXXXXXXXXXXXX 750
SGT P + WTS +P V G +LWTKTVW YTD+TSETGYSV+ +G
Sbjct: 1114 SGTTPPNNGOTSTVPTVAEGNYLWTKTvTfTYTDNTSETGYSW VKGDKGDP 1167

Query: 751 XXXXXXXXXXADG RS - Q YTHLAFS NS PNG EG FSHTD SG RA YVGQYQD FN PVHS KD PAAYT 809
DG+ + T + + SPNG A G + P +K +T
Sbjct: 1168 GNNGTNG I AG KDG KG I KATA I TYQAS PNGT TAPTGTWSASVPPVAKGSFLWT 1219

Query: 810 WTKW- - KGNDG AQG I PG KPGADG KTNYFH I A YAS SADG S 846
T W GN+G G PGK G KT I YA S G+
Sbjct: 1220

Score = 3[illegible], Expect = e-109

Identities = [illegible], Gaps = 42/449 (9%)

Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480
+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP +
Sbjct: 1155

Query: 481 IKGRFLWTKTFWlYTDGSHF/rGYSVAYIGQDGNSGKD^ 540
KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S
Sbjct: 1212

Query: 541 T EA P AGGWSTQ V PTVPGGQ YLWTRTRWRYTD QTD EI GY SVS RMG EQG P KGD AGR DGI 597
T P GW++ VPTV G YLWT+T W YTD T E GYSV+ MG +G KGD G +GI
Sbjct: 1272

Query: 598 AGKNG I GLKSTS VS YG I S PTDSAI P - GVWASQ V P S LI KG Q YLWTRT I WTYTDS TT ETGYQ 656
AGK+G G+K+T+++Y SP + P G W++ VP + KG +LWTRTIWTYTD+TTETGY
Sbjct: 1332

Query: 657 KTYI PKDGNDGKNG I AGKDGVGIKSTTITYAGSTSGTVAPTSNWTSAIPNVQPGFFLWTK 716
Y+ +GN+G +G GKDG GIK+TTITYAGSTSGT P + WTS +P V G +LWTK
Sbjct: 1392

Query: 717 TWNYTDDTSETGYSVSKIGETXXXXXXXXXXXXXXXXXXXXXXADGR^ 775
TVW YTD+TSETGYSV+ +G DG+ + T + + S
Sbjct: 1452
Query: 776 PNGEGFSHTDSGRAYVGQYQDFNPVHSKDPAAYTWTKW- - - KGND 817

PNG A G + P +K +T T W GN+

Sbjct: 1506 PNGT TAPTGTWS AS VPPVAKGSFLWTRTI WTYTDNTTETG YAVAYMGTNGNN 1557

Query: 818 GAQGI PGKPGADGKTNYFHI AYASSADGS 846

G G PGK G KT I YA S G+
Sbjct: 1558 GHDGFPGKDGTGXKTT- -TITYAGSTSGT 1584

Score = 384 bits (977), Expect = e-105

Identities = 179/322 (55%), Positives = 222/322 (68%), Gaps = 7/322 (2%)

Query: 421 IRLEGPKGDAGLPGAPGWXJVIXJVPGKSGVGIADT 480
+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP +
Sbjct: 1311 VAMKGVKGDKG DPGNNGTNGIAGKDGKGI KATAITYQASPNGTTAPTGTWSAS VPPV 1367

Query: 481 IKGRFLWTICrFWYTDGSHETGYSVAYIGQDGNSGKDGI AGKDGVGI AATEVMYASSPSA 540
KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S
Sbjct: 1368 AKGS FLWTRTIWTYTDNTTETGYAVAYMGTNGNN 1427

Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDEIGYSVSRMGEQGPKGDAGR- - -DGI 597
T P GW++ VPTV G YLWT+T W YTD T E GYSV+ MG +G KGD G +GI
Sbjct: 1428 TTPPNNGWTSTVPTVAEGNYLWTKTVWTYTDNTSff 1487

Query: 598 AGKNGIGLKSTSVSYGISPTDSAI P - GVWASQVPSLI KGQYLWTRTIWTYTDSTTETGYQ 656
AGK+G G+K+T+++Y SP + P G W++ VP + KG +LWTRTIWTYTD+TTETGY
Sbjct: 1488 AGKDGKG I KATAITYQAS PNGTTAPTGTWS ASV P P VAKG S FXWTRTI WTYTDNTTETG YA 1547

Query: 657 KTYI PKDGNDGKNGIAGKDGVGIICSTTITYAGSTSGTVAPTSNWTSAI PKVQPGFFLWTK 716
Y+ +GN+G +G GKDG GIK+TTITYAGSTSGT P + WTS +P V G +LWTK
Sbjct: 1548

Query: 717 TVWNYTDDTS ETG YSVSKIGET 738
TVW YTD++ ETGYSV K+G T
Sbjct: 1608



Score = 201 bits (507), Expect = 2e-50

Identities = 121/297 (40%), Positives = 156/297 (51%), Gaps = 19/297 (6%)

Query: 421 IRIiEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480 

+ + G KGD G PG +G +G+ GK G GI TAITY S +GT P WS VP + 
Sbjct: 1467 VAMKGVKGDKG DPGKNGTNGI AGKDGKGIKATAITYQASPNGTTAPTGTWSASVPPV 1523 

Query: 4 81 I KGRFLWTKTFWRYTDGSHETGYSVAYIGQDGNSGKTC 540 

KG FLWT+T W YTD + ETGY+VAY+G +GN+G DG GKDG GI T + YA S S 
Sbjct: 1524 AKGSFLWTRTI WTYTDNTTETGYAVA YMGTNGNNGHDGF PGKDGTG I KTTTITYAGSTSG 1583 

Query: 541 TEAPAGGWSTQVPTVlKX^YLWTRTRWRYTOG/n)EIGYSVSRMGEGGPKGDAGRDGIAGK 600 

T P GW++ VPTV G YLWT+T W YTD + E GYSV +MG GP AG +G GK 
Sbjct: 1584 TTPPNNGWTSTVPTVAEGKYLWTKTVWAYTDNS FETG YSVGKMGNTG P AGSNGNPGK 1640 

Query: 601 NGIGLKSTSVS YGISPTDSAI PGWASQVPSLI KG-QYLWTRTIWTYTDSTTE- - TGYQK 657 

+ T+ G++ S++ ++G+YW W+ G 

Sbjct: 1641 WSDTE PTTKFKGLTWKYSGWDMPLGNGTK I LAGTEYYWNGNNWALYEINAHNINGDNL 1700 

Query: 658 TYIPKDGNDGK-NGIAGKDGVGIKSTTITYAGS TSGTVAPTSNWTS AI PNVQ 708 

+ DGK I G +GV +TTGS +S+ TNTAINQ 

Sbjct: 1701 S VTNGTFKDGKI ES I WGSNG V NGTTTI EGSHLQI HS SDSTTNTEN - TLAIDNRQ 1753 



Query= sid|114823|lan|dplORF002 Phage dpi ORF|32386-35835|1
(1149 letters)

>dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL]
Length = 694

Score = 280 bits (709), Expect = 3e-74

Identities = 157/465 (33%), Positives = 257/465 (54%), Gaps = 28/465 (6%)

Query: 40 QIGSALTGLGKGLTTAVTLPLMGFAAAS I KVGNEFQAQMSRVQAI AGATAEELGRMKTQA 99 

+IG+++ +G+ +T VT P++ A + K G EF M +V+A +GAT EE +K +A 
Sbjct: 151 EIGNSMKNVGPJfMTMYVTAPWAGFAVAAKKGI^ 210 
Query: 100 I DLGAKT A FS AKEAAQG MENLAS AG FQVNE I MDAMPGVTiDLXXXXXXXXXXXXXXMAS S L
++GA T FSA ++A+ + +A AG+ ++M+ + GV+DL + L
Sbjct: 211 REMGATTKFSASDSAEALNYMALAGWDS KQMMEGLSGVMDLAAASGEEI/3AVSD I VTDGL

Query: 160 RAFGLEANQAGHVADVFARAAAirrNAETSDMAEAMKYVAPVAHSMGLSLEETAAS IGIMA
AFGL+A +GH+ADV A+ ++ N + + EA KYVAPVA ++G ++E+T+ +IG+M+
Sbjct: 271

Query: 220
+ AG I KG +AGT LR + ++ PT+AM M+ LG+S D+NG MIP+R+ + QL+
Sbjct: 331 NAGIKGEKAGTAIjRTMFTNI^SPTRAMGNEMERIiGISITDS^ 390

Query: 280 GLTQEERNRHLVTLYGQNSLSGMIJUJjDAG PEKLDKMTNAL 339
L+++++ T++G+ ++SG LA+++A E K+T ++ +S GA+K MA+TM+ L
Sbjct: 391 HI^KDGXJASSAATIFGKEAMSGAIAIINASDEDYQKLTKSIDSSTGASKRMA^ 450

Query: 340 SKIEQMGGAFESVAI IVQQI LEPALAKIVGAITKVLEAFVNMSPIGQKMVVIFAGMVAAL 399
K+ + E +A+ + +EPAL IV A +KV+ + Q W F VA L
Sbjct: 451

Query: 400 GPLLLIAGM VMTTI VKLRIAIQFI/3PAFMGTMGTI AGVI AI F 441
GPL-f + G+ MT + L I + F IA ++ +F
Sbjct: 511

Query: 442
ALV + F AY +SB FRN +N + FA
Sbjct: 571



Query= sid|114824|lan|dplORF003 Phage dpi ORF|53538-55877|3
(779 letters)

>sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|1074025|pir||E64098 DNA polymerase I
(polA) homolog - Haemophilus influenzae (strain Rd KW20)
>gi|1573871 (U32767) DNA polymerase I (polA)
[Haemophilus influenzae Rd]
Length = 930

Score = 191 bits (481), Expect = 1e-47

Identities = 148/553 (26%), Positives = 262/553 (46%), Gaps = 60/553 (10%)



Query: 63 RLELITEEAKJ^QYvDKMIEIXSIGSIDVOT^ 

. . .*t ..i.v. ^ u-j.n trrn T.n * T. + + Y P+ 



+ E + +A li + + ♦ ++U tlU t XJ \»T -r -r * - ' 

Sbjct: 333 KYETIJjTQADLTRWIEKLNAAKLIAVDTETDS 392 

Query: 123 SNMTKMRI KNQISPEFMKKMLQRI VDSGI PVI YHNSKFDMKS I YWRIX3VKMNBPAWDTYL 182 

+ + + +K +L+ + I I N KFD +SI+ R G+++ +DT h 
Sbjct: 393 YLDAPKTLEKSTALAAIKPILE- - -NPNIHKIGQNIKFD-ESIFARHGIELQGVEFDTML 448 

Query: 183 AAMLI^ENESHSLKSUISKYVRNEENAEVAKFNDLFKGIPF 242 

♦ LN H++ L +Y+ +E A + ♦ F+ IP ♦ A YAA D T 

Sbjct: 449 LS YTLNSTGRHNMDDLAKRYLGHETIAFESLAGRG KSQLTFNQ I PLEQATE YAAEDADVT 508 

Query: 243 FELYEFQEQYLTPGTEG^EEYNLEKVSWVLHNI BMPLI KVLFDMEVYGVDLDQDKLAE IR 302 

+L + E Y +E+PL+ VL ME GV +D D L 

Sbjct: 509 MKLQQALWLKIjQEE PTLVELYK TMELPLLHVLSRMERTGVLIDSDALFMQS 559 

Query: 303 EQFTANMNEAEQEFG^LVSEWQPEIEEIJlffrNFQSYQKLEMDARGRvTV^ 362 

+ + + E++ L + QL + 

Sbjct: 560 NE I ASRLTALEKQAYALAGQ - PFNLASTKQLQEI 592 

Query: 363 FYDIMGLKSPERDKPRG- --TGESIVEH- -FDNDISXXXXXXXXXXXXVSTYTT- LDQHL 416 

+D + L ++ P+G T E ++E + +++ STYT L Q + 

Sbjct: 593 LET)KLELPVLQKT-PKGAPSTNEEVLEELSYSHELPKI^ 651 

Query: 417 AKPDNRIHTTFKQYGAKTGRMSSENPNWJNI PSRGE - GAWRQI FAASEGHYI IGSDYSQ 475 

R+HT++ Q TGR+SS +PNLQNIP REG +RQ F A EG+ 1+ +DYSQ 
Sbjct: 652 NSQTGRVHTS YHQAVTATGRLSSSDPNI^NI P I RNEEGRHI RQAF I AREGYS I VAADYSQ 711 

Query: 476 QEPRSLAELSGDESMRHAYEQNIJDLYSVIGSKLYGVPYEECLEFYPIX3TTNKEGKLRRNS 535 

E R +A LSGD+ + +A+ Q D++ ++++GV +B T+++ R + 
Sbjct: 712 IELRIMAHLSGDQGLINAFSQGKDIHRSTAAEIFGVSLPE VTSEQ RRN 759 

Query: 536 VKSVLLG LMYGRGANS I AEQMNVSVKEANKVI EDFFTEFPKVADYI I FVQQOAQDLGYVQ 595 
K++ GL+YG A ++ Q+ +S +A K ♦+ +F +P V ++++A+ GYV+ 
Sbjct: 760 AKAINPGLIYGMSAFGl^RQI/3ISRADAQKYMDLYFQRYPSVQQFTnT>IRElKAKAQGY^ 819

Query: 596 TATGRRRRLPDMS 608 

T GRR LPD++ 
Sbjct: 820 TLFGRRLYLPDIN 832 

Score = 46.9 bits (109), Expect = 5e-04

Identities = 34/123 (27%), Positives = 66/123 (53%), Gaps = 16/123 (13%)

Query: 663 EIKDQAKAEGI LI KDNGGKI ADAQRQCLNS VI QGTAADMTKYAM I KV 709 

+I+++AKA+G + N + A+R +N+ +QGTAAD+ K AMIK+ 

Sbjct: 807 D I REKAKAQGYVETLFGRRLYL PD I NS S NAMRRKG AERVA I NAPMQGTAAD 1 I KRAM I KL 866 

Query: 710 HNDAELKEI^FHI^IPVHDEI^EVPIK^AKRGAERLTEVMIEAAKDIISLPMKCDPSIV 769 

++ + +++ VHDEL+ EV + E++ + M EAA +++ + P+ + + 

Sbjct: 867 -DEVIRHDPDIEMIMQVHDELVFEVRSEKVAFFREQIKQHM-EAAAELV-VPLIVEVGVG 923 

Query: 770 ERW 772 
+ w 

Sbjct: 924 QNW 926 

Query= sid|114825|lan|dplORF004 Phage dpi ORF|40401-42440|3
(679 letters)

>emb|CAB07981| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 532

Score = 1011 bits (2585), Expect = 0.0

Identities = 497/499 (99%), Positives = 498/499 (99%)

Query: 1 MTKFINSYGPLHLNLYv^QVSQDVTNNSSRV^WRATvT)RDGAY^ 60 

hH'KFINSYGPLHLNLYVEQVSQDVTNNSSRVSWRATvTDRDGAYRW 
Sbjct: 1 MTKF I NS YG P LHLNL YVEQVSQD VTNNS S R VS WRATVDRDGAYRTWTYGN I SNLSVWLNG 60 

Query: 61 SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSVWASFDPNNGW^ 120 

SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMS^ 
Sbjct: 61 SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSv^^ 120 

Query: 121 DS I PRSTQI S S FEGNRNLGSLHTVI FNRKVNS FTHQVWYRVFGSDWIDLGKNHTTS VS FT 180 

DS I PRSTQI S S FEGNRNLGSLHTVI FNRKVNS FTHQVTfreVFGSDWIDLGKNHTTSVSFT 
Sbjct: 121 DS I PRSTQI SSFEGNRNLGS LHTVI FNRKVNS FTHQVlfniWGSDWIDI/3KMHTTSTVSFT 180 

Query: 181 PSLDLARYLPKSSSGTMDI CIRTYNGTTQIGSDVYSNGWRFNI PDSVRPTFSGISLVDTT 240 

PSLOLARYLPKS SSGTMDI CIRTYNGTTQIGSD VYSNGWRFN I PDSVRPTFSG I SLVDTT 
Sbjct: 181 PSLDLARYLPICSSSGTMDICIRTYNGT^IGSDVYSNGWRFNIPDSVRPT^ 24 0 

Query: 241 SAVRQILTGNNFLQIMSNIQVNFNNASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF 300 

S AVRQI LTGNNFLQI MSN I QVNFNNASGA YGSTI QAFHAELVGKNQAI NENGGKLGMMN F 
Sbjct: 241 S AVRQI LTGNNFLQ I MSNI QWFNNASG AYGSTI QAFHAELVGKNQAINENGGKLGMMNF 300 

Query: 301 NGSATVRAWVTDTRGKQSNVQD VS INVIEYYGPS INFS VQRTRQNPAI IQALRNAKVAP I 360 

NGS ATVRAWVTDTRGKQSNVQD VS I NV I E YYG PS INFS VQRTRQN PAI IQALRNAKVAP I 
Sbjct: 301 NG S ATVRAWVTDTRG KQSNVQD VS INV I E YYG PS INFS VQRTRQN PAI IQALRNAKVAP I 360 

Query: 361 TVGGQQKNI^ITFSVAPljNTTNFTFJDRGSASGTFT^ 420 

TVGGQQ KNI MQ ITFSVA PLNTTNFTFJDRG SASGTFTTI S L +TNSS ANLAGNYG PDKSYT V 
Sbjct: 361 TVGGQXJKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLLTNSSANI^^ 420 

Query: 421 KAKIQDRFTSTEFSATVATESVVLNYDKDGRIXSVGKVVEQX*^ 480 

KAKIQDRFTSTEFSATV TES WLNYDKDGRLGVG KWEQGKAGS I DAAGD I YAGGRQ VQ 
Sbjct: 421 KAKI QDRFTSTEFSATVPTES VVLNYDKDGRLGVGKVVEQGKAGS IDAAGD I YAGGRQVQ 480 

Query: 481 QFQLTDNNGALNRGQYNDV 4 99 

QFQLTDNNGALNRGQYNDV 
Sbjct: 481 QFQLTDNNGALNRGQYNDV 499 

Query= sid|114827|lan|dp1ORF006 Phage dp1 ORF|45296-46987|2
(563 letters)

>gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Chlamydia pneumoniae]
Length = 1166

Score = 171 bits (429), Expect = 1e-41

Identities = 150/522 (28%), Positives = 254/522 (47%), Gaps = 55/522 (10%)

Query: 46 SSNNFE-LPYKYFNNVIDALDEWELHIFGELDKDVQDYIDSRKRIASSSNEQFSFKTTPF 104 

S + FE LP + ++ + L E + I GE++ D QD + T 

Sbjct: 659 SLDQFEALPVNF- - SMSERLI EIQKQI RGEI EFDFQD VPQQIQATLRSYQTEG 709 

Query: 105 AHQVECFEYAQEHPCEXLGDEQGLGKTKQAID IAVSRKASFKH - - CLI VCCI SGLKWNWA 162 

H +E + H +L D+ GLGKT QAI IAV+ + K C +♦ C + L +NW 

Sbjct: 710 VHWLE - - RLRKMHLNGILADDKGLGKTLQAI - IAVTQSKLEKGSGCSLI VCPTSLVYNWK 766 

Query: 163 KEVGIHSNESAHILGSRVTKTOKLVIDGV-SKRAEDLLGGHDEFFXITNICT 221 

+E + E LVIDGV S+R + L D IT* L+ V 
Sbjct: 767 EEFRKFNPEFR TLVI DG VPSQRRKQLTALADRD VAI TS YNLLQKDV 812 

Query: 222 YLNELTKSGEIGMVIIDEIHKCKNPSSKQGASIQKLQSYYK^LTGTPLM^PIDVF 281 

EL KS V++DE H KN +♦+ S++ +QS +++ LTGTP+ N+ ++♦+♦ 

Sbjct: 813 - - - ELYKSFRFDYWLDEAHH I KKRTTRNAKS VKMI QSDHRLILTGTPI ENSLEELWSLF 869 

Query: 282 KWLGAEHHTLTQFKERYCIVDQFHQITGYR NLAE LRELVNDYMLRRTKEEVL - DL 335 

+L L +R+ V ++ + Y N+ L++ V+ ++LRR KE+VL DL 

Sbjct: 870 DFIiMPG---LIiSSYDRF--VGKYIRTGNYMGNKADNMVALKKKVS 924 

Query: 336 PEKIRVTEYVDMNSKQSKIY KEVLTKLVQE I DKVKLMPNPLAET I RLRQATGN 388 

p + + + Q ++Y K+ L++LV++ ++ + LA RL+Q + 

Sbjct: 925 P P VS E I L YHCHLTE S QKELYQSY AASAKQ E LS RLVKQEG FERI H I HVLATLTRLKQ I CCH 984 

Query: 389 PSILTTQDVK SCKFERCIEIVEECIG^X^^^^FSNWEKVIEPIJUCIL-SICTVKCNL 444 

P+I + S K++ ♦+++ + G V+FS + K++ + K L S+ ♦ 

Sbjct: 985 PAIFA103APEPGDSAKYDMlJ©LIiSSLVDSGHKTWFSQYTKMI^IIKKDLESRGIPFVY 1044 

Query: 445 VTGETADKFNEI EEFMNHRKAS VI LGTIGALGTG FTLTKADTVI FLDS PWTRAEKDQAED 504 

+ G T ++ + + +F V L ++ A GTG L ADTVI D W A ++QA D 

Sbjct: 1045 LDGSTKNRLD LVNQ FNEDPS LL VFL I S LKAGGTG LNLVGADTVI HYDMWWNP AVENQATD 1104 

Query: 505 RCHRIGAKSSVTIYTLVAKGTVDERIEDLIERKGELADYIVD 546 

R HRIG SV+ Y LV T++E+I L RK L ♦+■► 
Sbjct: 1105 RVHRIGQSRSVSSYKLVTLNTIEBKILTLQNRKKSLVKKVIN 1146 



Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3
(463 letters)

>gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate bacteriophage
O1205]

Length = 411
Score = 88.9 bits (217), Expect = 7e-17

Identities = 80/315 (25%), Positives = 133/315 (41%), Gaps = 48/315 (15%)

Query: 139 QGVTLAGIFCDEVAIiMPESFVNQATGRCSVTGSKMWFSCNP 198 

♦G T G + +E +L E + RCS G+++ + NP NPNH+ +++I K + + 
Sbjct: 121 RGFTAFGAYVNEASLANELVFKEIISRCSGDGARVVWDSNPDNPNHWLNRDYIGKN 179 

Query: 199 ILYLHFTMDDNPSLT DSIKRRYEKMYAGVFRKRFILGLWVTADGLVYSMFNEEQHV 254 

1+ F +DDN L+ DSIK K G F R ILGLW A+G +Y+ ++ + HV 
Sbjct: 180 I IDFS FKLDDNTFLSKRYIDS I KAATPK GKFYDRDI LGLWTVAEGAI YADYDSKIHV 236 

Query: 255 KKLNI EFDRLFVAGDFGIYHATTFGLYGFSKRHKRYHLI ESYYHSGREAEEQLTEADVNS 314 

E R F D+G + + + G ++L++ +B + + +A 
Sbjct: 237 VDELPEMKRYFGG IDWGYTKYGS I VI VG - EG VDNNF YLVDG VAAQFKE IDWWVEQA 291 

Query: 315 NIQFSSVLQICrTKEYANDLVDMIRGKQIEYIILDPSASAMIVELQKHPYIAR KNIPI 371 

♦K T Y N + ♦ ++AR + I 

Sbjct: 292 RKLTGIYGN I PFYADSARPEHVARFENEGFDI 323 

Query: 372 I PARNDVTLGISFHAELLAENRFTLDPSNT - HDIDEYYAYS WDSKASQTGEDRVI KEHDH 430 

+ A V GI A+L E + + DE Y Y W *■+ +D +KE D 
Sbjct: 324 MNANKS V I AG I ELI AKLFKEKKLYVKRG FVPRFFDE I YQYRWKENST KDE PLKEFDD 380 

Query: 431 CMDRNR Y ACLTD AL I 445 

+D RYA +D +1 
Sbjct: 3 81 VLDSVRYAI YSDYVI 395 

Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1
(445 letters)

>gb|AAD19901| (AF100420) DnaB replication fork helicase [Thermus aquaticus]
Length = 444
Score = 67.5 bits (162), Expect = 2e-10

Identities = 69/248 (27%), Positives = 111/248 (43%), Gaps = 14/248 (5%)

Query: 147 GERI^ISTGFE30CXXXXXXXXXXXXXIVIMARPGQGKS-WTIDKMIATAWKNGHDVXJjYS 205 

GE G+ TGF+ I I ARP GK+ + ♦ A K G V +YS 

Sbjct: 178 GEVAGVRTGFKELDQLIGTLGPGSLNI - 1 AAR P AMGKTAFALT I AQNAALKEGVGVG I YS 236 

Query: 206 GEMSEMQVGARIDTILSNVSINSITKGIWlTOHQPElCYEDHIQAMTEAENSLVVVTPmiG 265 

EM Q+R+ + + +N+G DF+D ++EA •»■ TP + 
Sbjct: 237 LEMPAAQLTLRMMCSEARIDMbHiVRLGQLTBRDFSRLv^ - 1 YIDDTPDLTL 295 

Query: 266 GKNLTPAILDSMISKYRPSWGIDQLSLMS- - ESYPSREQKRIQYANITMDLYXISAKYG 323 

+ A ++S+ + ++ ID L LMS S S E ++ + A 1+ L + G 

Sbjct: 296 ME - - VRARARRLVSQNQVGLI 1 1 DYLQLMSGPGSGKSGENRQQEI AAI SRGLKALiARELG 353 

Query: 324 IPIVXNVQAGRSAKTBGAESMBLEHIAESDGVGQNASRVIAMKRD EKSGILEL 376 

IPI+ Q R+ + + L + ES 4- Q+A V+ + RD EK+GI E+ 

Sbjct: 354 IPIIALSQLSRAVEARPNKRPMLSDLRESGSIEQDADLVMFIYRDEYYNPHSEKAGIAEI 413 

Query: 377 SWKNRYG 384 

V K R G 
Sbjct: 414 IVGKQRNG 421 



Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2
(386 letters)

>gi|2760912 (AF037258) RecA protein [Chlorobium tepidum]
Length = 346

Score = 133 bits (331), Expect = 2e-30

Identities = 99/340 (29%), Positives = 164/340 (48%), Gaps = 66/340 (19%)
^ ^ GGLPR RV E +GPESSGKTT AL ♦ AQ 

Sbjct: 67 GGLPRGRVTE I YG PESSGKTTLALHAIAEAQ . KNG 100 

Query: 104 AVKELEMQLDSU}EPLKIVYLDLENTLDTEWAKKI^ 163 

+ L +D E+ D +A+K+GVD++ + + +PE S E+ L V 

Sbjct: 101 GIAAL- VDAEHAFDPTYARKLGVDINALLVSQPE - - SGEQALSX VE 143 

Query: 164 DI FETGEVGLWIiDSLPYMVSQNLIDEELTKKAYAG I S A PLTEF S RKVT P LLTRYNA I FI» 223 

♦ +G V ++V+DS+ +V Q ♦+ E+ + RK+T +♦+ L 

Sbjct: 144 T L VR SG AVD 1 1 VI D S VAAL V PQAE LEG EMGD SWG LQARLMSQALRKLTG AI S KS S S VCL 203 

Query: 224 GINQI REDMNSQYNA - YST PGG KMWKHACAVRLKFRKGD YLDENGAS LTRTARNPAGNW 282 

INQ+R+ + Y + +T GGK K +VRL RK + ++G L GN 
Sbjct: 204 FINQIiRDKIGVWGSPETTTGGKAIjKFYSSVRIiDIRKIAQI-KDGEELV GNRT 255 

Query: 283 ESFVEKTKAFKPDRKLVSYTLSYHDGIQIENDLVDVAVEFGVIQKAGAWFSIVDLETGEI 342 

+ VKK PK + ♦ Y +GI + +L+D+AVEFG+I+K+GAWFS + G 
Sbjct: 256 KVTCVVKNKV-APPFKTAEFDILYGEGISVLGELIDIATO 312 

Query: 343 MTDEDEEPLKFQGKANLVRRFKEDDYLFDMVMTAVHEIIT 382 

QG+ N+ KED+ L + ♦ V +++T 
Sbjct: 313 QGRENVKKLLKED ETLRNTI RQQVRDMLT 341 

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3
(359 letters)

>gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate bacteriophage
O1205]

Length = 348
Score = 187 bits (469), Expect = 1e-46

Identities = 118/358 (32%), Positives = 187/358 (51%), Gaps = 21/358 (5%)

Query: 3 IYDYINAGEIASYIQ^PSNALQYLGPTLFPNAO^GTDISWLKGAmiLPVTIQPSNYDA 62 

I YD + A IA Y AL N LG ++FP +Q GT +S++KGA+ V ♦ +D 
Sbjct: 4 I YDKVTASNI AGYFNALQENVSSTLGES I FPARKQLGTKLS YI KGASGQS VALXAAAFDT 63 

Query: 63 KASLRERAGFSKQATEMAFFRESMRLGEKDRQNLQMLLNQSSA-LAQPLITQLYNDTKNL 121 
♦+R+R +M FF+E+M + E DRQ L +♦ 4 +A L ♦+ ++ND L 

Sbjct: 64 NVTIRDRVSAEMHDEQMPFFKEAMLVKENDRQOI^VKDSGNAVLVNTIVAGIFNDNLTL 123 

Query: 122 VDGVEAQ AE YMRMQ LLQYG KFTVKS TNS EAQYTYD YNMD AKQQ YAVTKKWTN PAE SO P I A 181 

V+G A+ B MRMQ+L GK S Y D K+Q V+K W P ♦ P+A 

Sbjct: 124 VNGARARLEAMRMQ VLATGKI A FTSDG VNKD I D YG VKPDHKKQ - - VSKS WAEPG - ATPLA 180 

Query: 182 DI LAAMDD I ENRTGVRPTRMVLNRNTYNQMTKSDS I KKAL - AIGVQGS WENF LLLASDAE 240 

D+ A+ + G+ P R V+N T+ + K+ S K + + GS + ++ E 

Sbjct: 181 DLEDAI - ETARELGLNPERAVMNAKTFGLIRKAASTVKVIKPLAGDGS AVTKAELE 235 

Query: 241 KFIAEKTGLQIAVYSKKIAQFADADKLPDVGNIRQFNLIDDGKVVLXPPDAVGHTWYGTT 300 

+IA+ G+ I ♦ + D G + +F DG + L+P +G+T +GTT 

Sbjct: 236 NYIADNFGVS I VLENGTYRN DKGEVSKF- - YPDGHLTLI PNGPLGNTVFGTT 285 

Query: 301 PEAFDLASGGT - D AQ VQ VLSGG P TVTTYLE KH P VN I ATWS A VM I P S FEG I D YVGVLT 357 

PE DL + T +A+V+++ G VTT PVK+ T VS V +PSFE +D V +LT 

Sbjct: 286 PEESDLFADNTVNAEVEIVDNGIAVTTTKTTDPVNVQTKVSMVA^ 343 



Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3
(341 letters)

>sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA AND TAU
Length = 563

Score = 182 bits (458), Expect = 2e-45

Identities = 118/353 (33%), Positives = 176/353 (49%), Gaps = 31/353 (8%)

Query: 7 YRPG^FEEWAQEYVKEILLNQLQNGAIKHGVLFCXXXXXXXXXXXRIFAKDVN- 60 

+RPQ FE+W QE++ + L N L H YLF +IFAK VN 

Sbjct: 10 FRPQRFEDVVGQEHITKTLQNALIX3KKFSHAYLFSGPRGTGKTSAAKIFAJ^ 69 

Query: 61 KGL GSPIEIDAASNNGVENVRNIIEDSRYKSMDSEFKVYTIDEVH 105 

KG+ IEIDAASNNGV+ +R+I ++ +KVYIIDEVH 

Sbjct: 70 DE PCNECAACKG I TNG S I SD VI E I DAASNNGVDE I RDIRDKVKFAPSAVTYKVYI IDEVH 129 

Query: 106 MLSTGAFNALIjKTLEE PS SGTVF I LCTTD PQKI PDTI LSRVQRFDFTRI DNDD I VNQLQF 165 

MLS GAFNALLKTLEEP +FIL TT+P KIP TI+SR QRFDF RI + IV ++ 
Sbjct: 130 MLSIGAFNAUJCTLEEPPEHCIFILATTEPHKIPLTIISRCQRF^ 189 

Query: 166 IIESENEEGAGYSYERDALSFIGKLANGGMRDSITRLEKVLDYSHHVDMBAVSNAL---G 222 

I+++B E +L I A+GGMRD+++ L++ + +S D+ V +AL G 

Sbjct: 190 IVDAEQ LQVEEG SLEI I AS AAHGGMRD ALS LLDQAI S F SG - - D I LKVED ALLI TG 242 

Query: 223 VPDYETFASLVEAIANYTCSKCLEIVNDFHYSGra 282 

L +++ ♦ + S LE +N+ GKD + + 4- ++ Y + 
Sbjct: 243 AVSQLYIGKLAKSLHDKNVSDALETLNELIXX& 302 

Query: 283 ITQLPAHFESKLEQFCEAFQYPTLLWMLEEMNELAGWKWEPNAKPIIETKLL 335 

•f + E L M++ +N+ +KW + + E ++ 

Sbjct: 303 GVLEKVKVDETFRELS EQI PAQALYEMIDI LNKSHQEMKWTNHPRI FFEVAW 355 



Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3
(337 letters)

>sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64227 DNA primase (dnaE) homolog
MG250 - Mycoplasma genitalium (SGC3) >gi|3844848
(U39704) DNA primase (dnaE) [Mycoplasma genitalium]
Length = 607

Score = 57.0 bits (135), Expect = 2e-07

Identities = 53/190 (27%), Positives = 89/190 (45%), Gaps = 17/190 (8%)

Query: 146 EELDKYRFIHP YMYERKLTDELIEMFDVGYDK--LHDCITFPVRNLKGETVFF 196 

E +++Y FI+P Y++ K ++FD K +IP++GVF 

Sbjct: 170 ESMERYPFINPKIKPSELYLFS - KTNQQGLGFFDFNTKKATFQNQIMIPIHDFNGNPVGF 228 

Query: 197 NRRS VRSKFHQYGEDDPKTEFLYGQYELVAFRDYFEKPI SQVFVTES VI NCLTLWSMKI P 256

+ RSV + ++ EF + + EL+ K ++Q+F+ E ■*■ TL + K ^ 

Sbjct: 229 SARS VDNI NKLKYKNSADHE F - FKKGELLFNFHRLNKNLNQLF I VEG YFD VFTLTNS KFE 287 

Query: 257 AVALMGVGGGN-QINLLKR- - LPYRNIVLALDPDNAGG/TAQEKLYRQLKRSK-VVRFLNY 312 

AVALMG+ ♦ QI +K + +VLALD D +GQ A L +L ♦ +V + + 

Sbjct: 288 AVALMGLALNDVQ I KAI KAHFKE LQTLVLALDNDA5GQNAVFS LI EKLNNNNF I VE I VQW 347 




Query: 313 PKEFYDNKWD 322 

+ D WD 
Sbjct: 348 EHNYKD--WD 355 



Query= sid|114837|lan|dp1ORF016 Phage dp1 ORF|43413-44303|3
(296 letters)

>emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine amidase [bacteriophage Dp-1]
Length = 296

Score = 661 bits (1686), Expect = 0.0

Identities = 296/296 (100%), Positives = 296/296 (100%)

Query: 1 I^VDIEKGVAWMQARKGRVSYSMDFRDGPDSYDCSSS^^^AI^SAGASSAGWAVNTBY^ 60 

MGVD I EKG V AWMQARKGR VS Y S MD FRDG PDS YDCS S SMYYALRS AGAS S AGWA VNTEYMH 
Sbjct: 1 KGVDI EKG VAWMQ ARKGRVS YSMD FRDG PDS YD CS S SMYYALRSAGAS S AG W A VNTEYMH 60 

Query: 61 AWLIENGYELI SENAPWDAKRGDI FI WGRKGASAGAGGHTGMFIDSDNI IHCNYAYDGI S 120 

AWLI ENG YELI S ENAPWDAKRGD I F I WGRKGAS AGAGGHTGMF I DSDN 1 1 HCNYAYDGI S 
Sbjct: 61 AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMFIDSDNI IHCNYAYDGI S 120 

Query: 121 VNDHDERVT^AGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWARANCOT 180 

VNDHDERVTXTAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGT^ E 
Sbjct: 121 VNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWY E 180 

Query: 181 F^KSWFYTDDQGYMLAEKWIJCHTDGNWYWFDRDGYMATSWKRIGE 240 

ENKSWFYFDDQGYMLAEKWLKHTDGNWYW 
Sbjct: 181 F^KSWF^FDDCCYMLAEKWLKHTDGNWYWFDRDGYMATSWKRIG 240 

Query: 241 IKYYDNVTyYCDATNGDMKSNAFIRYNDGWYLLLPDGRIJU)KPQFTVEPDGLIT 296 

I KYYDNWYYCDATNGDMKSNAFIRYNDGVm^LPDGRLADKPQFT^ 
Sbjct: 241 I KYYDNWYYCD ATNGDMKSNAF I R YNDG WYLLLPDGRLADKPQPTVE P DG L I TAKV 296 

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1
(264 letters)

>emb|CAB13247| (Z99111) similar to coenzyme PQQ synthesis [Bacillus subtilis]
Length = 243

Score = 217 bits (548), Expect = 5e-56

Identities = 117/248 (47%), Positives = 163/248 (65%), Gaps = 15/248 (6%)

Query: 23 MPIMEIFGPTIQGEGMVIGQKTIFIRTGGCDYHCNWCDSAFTWMGTTEPE - - YITGKEAA 80 

+P++EIFGPTIQGEGMVIGQKT+F+RT GCDY C+WCDSAFTW+G+ + + ++T +E 
Sbjct: 5 IPVXEirePTIC^EGMVIGQKTMFTOTAGCDYSCSWCDSAFTWDGSAKKDIRVWTAEEIF 64 

Query: 81 SRI LKLAFNDKGEQI CNHVTLTGGNPALINEPMAKMIS I LKEHGFKFGLETQGTRFQEWF 140 

+ + D G +HVT++GGNPAL+ + + I +LKE+ + LETQGT +Q+WF 

Sbjct: 65 AEL KDIGGDAFSHVTISGGNPALLKQ-U3AFIELLKENNIRAALF/TGCTVYQDWF 118 

Query: 141 KEVS D I T I S P K P PS SGMRTNMKI LEAI VDRM - - NDENLDWS F KI VI FDENDLAY ARDMF K 198 

+ D+TISPKPPSS MTN+L+I++ND S K+VIF++ DL +A+ ♦ K 

Sbjct: 119 TLIDDLTI S PKPPSSKMVTNFQKLDHI LTSLQENDRQHAVSLKVVI FNDEDLEFAKTVHK 178 

Query: 199 TFEGKLRPVNYLSVGNANAY - - EEGKISDRUjEKLGWLWDKVYEDPAFNNVRPLPQLHTL 256 

+ G YL VGN + + + LL K L DKV D N VR LPQLHTL 

Sbjct: 179 RYPG 1 PFYLQVGNDDVHTTDDQSL I AHLIX3KYEyVLVDKVAVDAELNLVRVLPQLHTL 235 



Query: 257 VYDNKRGV 264 

++ NKRGV 
Sbjct: 236 LWGNKRGV 243 

Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2
(263 letters)

>sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >gi|98411|pir||A38256 GTP
cyclohydrolase I (EC 3.5.4.16) - Bacillus subtilis
>gi|143231 (M37320) regulatory protein [Bacillus
subtilis] >gi|143799 (M80245) MtrA [Bacillus subtilis]
>gi|2634696|emb|CAB14194| (Z99115) GTP cyclohydrolase I
[Bacillus subtilis]
Length = 190

Score = 208 bits (523), Expect = 4e-53

Identities = 103/185 (55%), Positives = 133/185 (71%), Gaps = 1/185 (0%)

Query: 80 VTLDjn^AAVQRLFGLLGEDAERDGLQDTPFRFVTCAIAEH™ 139 

V + B AV+++ +GED R+GL DTP R K AE G EDPK H + F +H 
Sbjct: 4 VNKEQIEQAVRQILEAIGEDPNREGLIJ3TPKRVAKMYAEVFSGIJ^DPKEHFQTircENH 63 

Query: 140 EDLVLVKD I PFNS LCEHHLAPFVGKVHI AYI PKD - KITGLSKFGRVVEG YAKRI<}VQERL 198 

E+LVLVKDI F+S+CEHHL PF GK H+AYIP+ K+TGLSK R VE AKR Q+QER+ 
Sbjct: 64 E ELVLVKD IAFHS MCEHHLVP FYGKAHVAYI PRGGKVTGLS KIARAVEAVAKRPQLQERI 123 

Query: 199 TQQ I ADAI QEVLN PQAVAVI VEAEHTCMSGRG I KKHGATTVTSTMRGLFQDDASARAELL 258 

T IA++I B L+P V V+VEAEH CM+ RG++K GA TVTS +RG+F+DDA+ARAE+L 
Sbjct: 124 TSTIAES IVETLDPHGVMVVVEAEHMOrTMRGVRKPGAKTVTSATO 183 

Query: 259 QLIKK 263 

+ IK+ 
Sbjct: 184 EHIKR 188 



Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2
(259 letters)

>gi|2347102 (U77367) internalin [Listeria monocytogenes]
Length = 821

Score = 55.0 bits (130), Expect = 5e-07

Identities = 44/149 (29%), Positives = 63/149 (41%), Gaps = 13/149 (8%)

Query: 119 FRMNIYVPNYVG- -DSIVNYVKITLNNCTGKAPGLSIGKEFYAPEFNIKAREATKAGLPV 176 

F + VPN + D + + NNTAPL YPE+K + K + 

Sbjct: 383 FS KTLS VPNN ITS IDGTLI APETI SNNGTYDAPNLKWSLPNYLPE - - VKYTFSQKI PIGT 440 

Query: 177 KSMDYVAQLPAVLR RVTFDLNGGTGTADAVRVEAGKKISPKPVDPTLTGKAFKGW 231 

+ +Y + L+ +VTF++ G T + V E ♦ P+P PT G F GW 
Sbjct: 441 GTSNYSGF ITQPIJCELLDYKVTFNVlSGNTSEVETvTEE NLI PEPTS PTKQG YTFDGW 497 

Query: 232 - KVEGESTIWDFDNHMMPDRDVKLVAQFA 259 

E T WDF MP D+ L A F+ 
Sbjct: 498 YDAETGGTKWD FTTG QM P ANDLTL YAH F S 526 



Score = 43.4 bits (100), Expect = 0.002

Identities = 47/195 (24%), Positives = 73/195 (37%), Gaps = 12/195 (6%) 

Query: 72 YDLTFKDNTFDPEIMALIEGGTVRQQGGTIAGYDT- PMLAQGASNMKPFRMNI YVPNY- - 128 

YD +T+ +G ♦ GG +TMA+ F+NYN+ 

Sbjct: 547 YDALLNEPTT PTKQG YTnX^fnDAETGGNKWDFKTMKMPANDVAFY 606 

Query: 129 VGDSI VNYVKITLNNCTGKAPGLS IGKEFYAPEFNI KAREATKAGLPVKSMDYVAQL 185 

V++Y+ T G + ♦ A K TK+P + A 

Sbjct: 607 DGEVKNETIAYDTLLNEPTTPTKQGYTFDGWYDAETGGTKTO 665 

Query: 186 PAVXRRVTFDLNGGTGTADAVRVEAGKKI S PKP VDPTLTGKAFKGW- KVEGE ST I WD FDN 244 

+ FD++G T + V +A + P+P P+ TG +GW E T WDF 
Sbjct: 666 TINNYQANFDIDGAV - TEEWNYDA LI PEPTS PSKTGFTLEGWYDAEVGGTKWDFKT 721 

Query: 245 HMMPDRDVKLVAQ FA 259 

MP D+ L A F + 
Sbjct: 722 MKMPANDI TLYAHFS 736 



Score = 38.3 bits (87), Expect = 0.057

Identities = 42/169 (24%), Positives = 59/169 (34%), Gaps = 10/169 (5%)

Query: 96 QQGGTIAGYOT- PMIAQGASNMKPFRMNIYVPim^GDSIVNTVKIT LNNCTGKAPG 150

+ GGT +TMA + F+NVN+D+V ♦ LNT 
Sbjct: 501 ETGGTKWDFTTGQMPANDLTXYAHFS WSYQANFD I DGWTNEAWYDALLNE PTTPTKQ 560 

Query: 151 LS XGKE F YAPEFNI KAREATKAGLPVKSMDYVAQLPAVIjRR VTFDI2IGGTGTADAWVEA 210 

+Y E + +P + + A + FD++G A 
Sbjct: 561 GYTFIXSWYDAETGGNKWDFKTMKMPANDV^ A 616 

Query: 211 GKKISPKPVDPTLTGKAFKGW * KVEGESTIWDFDNHMMPDRDVKLVAQF 258 

+ +PPTGFGM E T WDF MP DV LAP 
Sbjct: 617 YDTIJire PTTPTKQGYTF1X3WYDAETG 665 



Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2
(228 letters)

>gi|2650185 (AE001074) succinoglycan biosynthesis regulator (exsB)
[Archaeoglobus fulgidus]
Length = 239

Score = 119 bits (295), Expect = 2e-26

Identities = 79/224 (35%), Positives = 113/224 (50%), Gaps = 11/224 (4%)

Query: 1 MKSVVXLSGGVT)SATCLAIEVT)KWGSKNV^ 60 

MK+V+LLSGG+DS+T L +D G VHA+ F YGQKH E+E+A VA V+ 
Sbjct: 1 MKAVMLLSGGIDSSTLLYYLLD- - GG YEVHALT F FYGQKHS KE I E SAEKVAKAAKVRHLK 58 

Query: 61 LEIDSKJYXXXXXXIJjQGKGEISHGKSYAEILAEKEVVD^ 120 

++I S 1+ L G+ E+ Y+E + + T VP RN ++LS 
Sbjct: 59 VD I - STIHDLI SYGALTGEEEVPKA - FYS EEVQRR TIVPNRNMILLS - - IAAGYAV 110 

Query: 121 XXXXXXXXXXXXXXXXXXXPDCTPEFYNSMSNAMEYGT - GGKVTLVAPLLTLTKAQWKW 179 

PDC EF ++ A+ V + AP + +TKA +V+ 

Sbjct: 111 KIGAKEVTIYAAHLSDYSIYPDCRKEFVKAIiDTAvT 170 

Query: 180 G IDLDVPYFLTRSCYESDAESCGTC ATCI DRKKAFEENGMTDP I 223 

G+ L VPY LT SCYE C +C TC++R +AF NG+ DP+ 

Sbjct: 171 GLKLGVTYELTWSCYEGGDRPCLSCX5TCLERTEAFLANGVKDPL 214 



Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2
(173 letters)

>emb|CAB13248| (Z99111) similar to hypothetical proteins [Bacillus subtilis]
Length = 165

Score = 220 bits (556), Expect = 4e-57

Identities = 103/139 (74%), Positives = 117/139 (84%)



Query: 5 TTRTDAELTGVTLLGNQDTKYDYDYNPDVLETFPNKH P ENNYLVTFDGYEFTS LCPKTGQ 64
TTR ++EL GVTLLGNQ T Y ++Y PDVLE+FPNKH +Y V F+ EFTS LCPKTGQ
Sbjct: 2 TTRKESEI*EGVTLLGNQGTNYLFEYAPDV1^SFPNKHWRDYFVKFNCPEFT 61

Query: 65 PDFANVFISYIPNEKMVESKSLKLYLFSFRNHGDFHEDCMNIIIJTDLYEIJ^EPKTIEV^ 124
PDFA + + 1 S YI P + EKMVES KSLKLYLF S FRNHGDFHEDCMN 1 1 +ND L ELM+P+YIEV G
Sbjct: 62 PDFATIYISYIPDEKMVESKSIJCLYIiFSFRNHGDFHEIXSWIIMNDLIEI^^ 121

Query: 125 LFTPRGGISIYPFVNKVNP 143
FTPRGGISI P+ N P
Sbjct: 122 KFTPRGGISIDPYTNYGKP 140





Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1
(184 letters)

>gi|1353529 (U38906) ORF12 [Bacteriophage r1t]
Length = 296

Score = 53.5 bits (126), Expect = 1e-06

Identities = 42/149 (28%), Positives = 70/149 (46%), Gaps = 9/149 (6%)

Query: 34 IAS^^^VGNGICTSWA VTUXQRYI*AETAIJDGR I VEKGMF W 93 

+ S G GK+ A+ +L+ LTL ++ V + F ♦ + P ♦ + F+ 

Sbjct: 155 WSGPAGTGKSHLAMSILKDCLQHTDLT- -VIFASWSEVLHLIKDSFDNKDSFYSTEYFM 212 
Query: 94 ERFERLKTCEIXVIDEIGGGSLTKASYPYLYDLVNYR^ 153 

E F + +LLVID+IG +T+ S L ++♦+ R TI TTN DEI 

Sbjct: 213 EVF- - -RNTDLLVIDDIGSEKITEWSMSLLTEVLDART KTIITTNLKSDEIRKKYH 265 

Query: 154 QRLYSRIYDTSVVLDFQASNVRGLEVSEI 182 

R YSR++ F N++ VS++ 

Sbjct: 266 NRTYS RLFRG I GKKAFNFEN I KDKRVSQL 294 

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3
(173 letters)

>sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|1074675|pir||F64021 hypothetical
protein HI1190 - Haemophilus influenzae (strain Rd KW20)
>gi|1574117 (U32798) 6-pyruvoyl tetrahydrobiopterin
synthase, putative [Haemophilus influenzae Rd]
Length = 141

Score = 100 bits (247), Expect = 6e-21

Identities = 59/143 (41%), Positives = 83/143 (57%), Gaps = 10/143 (6%)

Query: 2 R VSKTLT FDAAHQLVGH rcKCANLHGHTYKVE I SLAGGTYDHGSSQGMWDFYHVKKIA- 60 

++SK +FD AH L GH GKC NLHGHTYK+++ ++G Y G+ + MV+DF +K I 
Sbjct: 3 KISKEFSFDMAHLLDGHDGKCQNLHGHITKLC^ 62 

Query: 61 GTFIDRLDHAVLL - QGNEP 1 ALANAVDTKRVLFG FRTTAENMSRFLTWTLTBLMWK 115 

+D +DHA + Q NB L +++K FRTTAE ++RF+ L + 

Sbjct: 63 KVT LDPMDHAFI YDQTNERESQ I ATLLQKLNS KTFGVP FRTTAEE I ARF I FNRLKH - - DE 120 

Query: 116 HARIDS I KLWET PTGCAECTYYE 138 

I SI+LWETPT + C Y E 
Sbjct: 121 QLSISSIRLWETPT- -SFCEYQE 141 



Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3
(165 letters)

>emb|CAA68244| (X99978) ORF7; hydrophobic protein [Lactobacillus plantarum]
Length = 168

Score = 64.4 bits (154), Expect = 5e-10

Identities = 49/156 (31%), Positives = 84/156 (53%), Gaps = 9/156 (5%)

Query: 8 WLVRTALIAALYVTLTVAFSAI S Y - - G P I QFRVS EALI LLP LWNHRWTPG I VLGTI IANF 65 

W++ AL+AA+YV L + +A S G IQFRVSE L L ++N +♦ GIV G 1+ + 
Sbjct: 9 WIIN-ALVAAMYVVliCI/»PAAFSI»ASGAIQFRVSEGLNHIAvTNR 67 

Query: 66 FSP-LGLIDVLFGSLATFLGXXXXXXXXXXXSPLYSLICPVLA NAYLIALELRIVY 120 

F P L++VLFG ♦ L f+ + +A + ++IAL + ++ 

Sbjct: 68 FGPGASLLNVLFGGGQSLLAIJLiVLTWIA^ 127 

Query: 121 S-LPFWESVIYVGISEAIIVXISYFLISTLAKNNHF 155

S + FW + + +SE 11+ 1+ ++ +L + HF 
Sbjct: 128 SG VAFW PTYLTTALS ELI I MS I TAP IMYSLDRVLHF 163 



Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3
(163 letters)

>gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis]
>gi|2634394|emb|CAB13893| (Z99114) similar to
deoxyuridine 5'-triphosphate nucleotidohydrolase
[Bacillus subtilis] >gi|3025643 (AF020713) putative
dUTPase [Bacteriophage SPBc2]
Length = 142

Score = 108 bits (267), Expect = 2e-23

Identities = 65/160 (40%), Positives = 83/160 (51%), Gaps = 25/160 (15%)

Query: 5 VDVKMIDPKIXJRIJCrT- -GDWVDVRISSI 62 

+ +K +D R+ GDW+D+R + I D + 
Sbjct: 3 I KI KYLDETQTRINKMEQGDW I DLRAAEDVAI KKDEFKL 41 

Query: 63 KIAHGFALELPKGYEAIUiPRSSLFKKTGLIFVSS-GVIDEGYKGin'DEWFSVWYATRDA 121 

+ G A+ELP+GYEA + PRSS +K G+I +S GVIDE YKGD D WF YA RD 
Sbjct: 42 -VPI^VAMELPEGYEAHWPRSSTYKNFGVIQTNSMGVIDESYKGDNDFWFFPAYAUIDT 100 
Query: 122 D I FYDQRI AQFR I QEKQPAI KFNFVES1XSNAARGGHGSTG 161 

I RI QFRI +K PA+ V+ LGN RGGHGSTG 
Sbjct: 101 KIKKGDRICQFRIMKKMPAVDLIEVDRLGNGDRGGHGSTG 140 

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3
(142 letters)

>emb|CAB07984| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 142

Score = 287 bits (728), Expect = 2e-77
Identities = 142/142 (100%), Positives = 142/142 (100%)



, MPMWLKDTAVLTTI ITACSGVLTVLLNKLFEWKSN1CAKSVLEDI STTLSTLKQQVDGIDQ 60 

— " 5SBKSSSSSSSSSSSSSSSK " 

Query: 121 VEALYEKYKKLPIREEDLDETI 142 

VEALYEKYKKLP I REEDLDET I 
Sbjct: 121 VEALYEKYKKLPIREEDLDETI 142 

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1
(89 letters)

>emb|CAB07983| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length = 124

Score = 147 bits (367), Expect = 1e-35

Identities = 75/75 (100%), Positives = 75/75 (100%)



60 



Query: 61 EQKLRETRYAIEDEI 75 

EQKLRETRYAIEDEI 
Sbjct: 61 EQKLRETRYAIEDEI 75 

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1
(74 letters)

>emb|CAB07985| (Z93946) holin [bacteriophage Dp-1]
Length = 74

Score = 63.2 bits (151), Expect = 2e-10
Identities = 34/74 (45%), Positives = 34/74 (45%)

Query: 1 MKLSNEQYDXXXXXXX^^ 60 

MKT ^MEOYD xQFD j 

Sbjct: 1 MKLSNEQYDVAKNVVTVVVPAAI AL I T 60 

Query: 61 NYQKEQEAQNNEVE 74 

NYQKEQEAQNNEVE
Sbjct: 61 NYQKEQEAQNNEVE 74 

Condensed listing of homology information from above

Phage: dp1
Database: nr
Program: Blastp

Query= sid|114822|lan|dp1ORF001 Phage dp1 ORF|36698-40390|2
(1230 letters)

gi|2444124 (U88974) ORF45 [Streptococcus thermophilus temperate ... 467 e-130
gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage B... 427 e-118
gi|2935676 (AF032121) unknown [Streptococcus thermophilus bacter... 309 1e-82
gi|2935691 (AF032122) unknown [Streptococcus thermophilus bacter... 306 7e-82
gi|3540289 (AF057033) putative anti-receptor [Streptococcus ther... 279 6e-74
gi|4530154|gb|AAD21894.1| (AF085222) putative tail-host specific... 220 3e-56
gi|930045|emb|CAA33387| (X15332) alpha-1 (III) collagen [Homo sa... 58 4e-07
gi|1070603|pir||CGHU7L collagen alpha 1(III) chain precursor - h... 58 4e-07
gi|4502951|ref|NP_000081.1|PCOL3A1| collagen, type III, alpha 1 ... 58 4e-07
gi|115290|sp|P04258|CA13_BOVIN COLLAGEN ALPHA 1(III) CHAIN >gi|7... 58 4e-07
gi|575322|emb|CAA36279| (X52046) type III collagen [Mus musculus] 57 8e-07
gi|2119163|pir||S59856 collagen alpha 1(III) chain precursor - m... 57 8e-07
gi|543912|sp|P13941|CA13_RAT COLLAGEN ALPHA 1(III) CHAIN >gi|543... 57 1e-06
gi|3171998|emb|CAA06510| (AJ005395) collagen alpha 1 (III) [Ratt... 57 1e-06
gi|3947565|emb|CAA90250| (Z49967) similar to collagen; cDNA EST ... 54 7e-06
gi|423403|pir||A46053 bullous pemphigoid antigen, BPAG2, type XV... 53 9e-06
gi|115410|sp|P12114|CCS1_CAEEL CUTICLE COLLAGEN SQT-1 >gi|84437|... 53 9e-06
gi|3873801|emb|CAA90084| (Z49907) cuticle collagen SQT-1; cDNA E... 53 9e-06

Query= sid|114823|lan|dp1ORF002 Phage dp1 ORF|32386-35835|1
(1149 letters)

gi|3341922|dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL] 280 3e-74
gi|4126622|dbj|BAA36642.1| (AB016282) ORF36 [bacteriophage phi-105] 232 1e-59
gi|1369948|emb|CAA59194| (X84706) host interacting protein [Bact... 201 3e-50
gi|3139112 (AF063097) gpT [Bacteriophage P2] 188 2e-46
gi|3337272 (U32222) G protein [Bacteriophage 186] 161 3e-38
gi|4063799|dbj|BAA36253| (AB008550) orf25; similar to T gene of ... 159 8e-38
gi|3172274 (AF022214) minor tail subunit; putative tape-measure ... 123 6e-27
gi|465127|sp|Q05233|VG26_BPML5 MINOR TAIL PROTEIN GP26 >gi|41904... 108 2e-22
gi|3540284 (AF057033) putative minor tail protein [Streptococcus... 99 2e-19
gi|2444119 (U88974) ORF40 [Streptococcus thermophilus temperate ... 90 6e-17
gi|2634555|emb|CAB14053| (Z99115) yomI [Bacillus subtilis] >gi|3... 66 1e-09
gi|2392838 (AF011378) unknown [Bacteriophage sk1] 64 5e-09
gi|2764873|emb|CAA66557| (X97918) gene 18.1 [Bacteriophage SPP1] 62 3e-08
gi|1353559 (U38906) ORF42 [Bacteriophage r1t] 61 6e-08
gi|630841|pir||S39079 puff C-8 protein - fungus gnat (Rhynchosci... 55 2e-06
gi|1730865|sp|P51731|YO27_BPHP1 HYPOTHETICAL 72.8 KD PROTEIN IN ... 53 8e-06
gi|224288|prf||1101273J ORF 7 [Bacteriophage HP1] 53 1e-05

Query= sid|114824|lan|dp1ORF003 Phage dp1 ORF|53538-55877|3
(779 letters)

gi|118825|sp|P00562|DPO1_ECOLI DNA POLYMERASE I (POL I) >gi|6705... 193 3e-48
gi|2982102|pdb|1KFS|A Chain A, All-Oxygen Dna Complexed To The 3... 193 3e-48
gi|229889|pdb|1DPI| DNA Polymerase I (Klenow Fragment) (E.C.2 193 3e-48
gi|1169402|sp|P43741|DPO1_HAEIN DNA POLYMERASE I (POL I) >gi|107... 191 1e-47
gi|2688462 (AE001156) DNA polymerase I (polA) [Borrelia burgdorf... 190 3e-47
gi|809180|pdb|1KLN|A Escherichia coli 190 3e-47
gi|1913934|emb|CAA72997| (Y12328) DNA-directed DNA polymerase I ... 189 8e-47
gi|4090935 (AF028719) DNA polymerase type I [Rhodothermus sp. 'I... 175 1e-42
gi|4731571|gb|AAD28505.1|AF121780_1 (AF121780) DNA polymerase I ... 174 2e-42
gi|1633576 (U57757) similar to proofreading 3'-5' exonuclease an... 173 4e-42
gi|3322368 (AE001195) DNA polymerase I (polA) [Treponema pallidum] 172 9e-42
gi|1006595|dbj|BAA10748| (D64005) DNA polymerase I [Synechocysti... 171 2e-41
gi|585062|sp|Q07700|DPO1_MYCTU DNA POLYMERASE I (POL I) >gi|4161... 163 5e-39
gi|4376908|gb|AAD18751| (AE001645) DNA Polymerase I [Chlamydia p... 157 2e-37
gi|1169403|sp|P46835|DPO1_MYCLE DNA POLYMERASE I (POL I) >gi|107... 152 7e-36
gi|2145839|pir||S72949 DNA polymerase I - Mycobacterium leprae >... 152 7e-36
gi|1405438|emb|CAA67184| (X98575) DNA-dependent DNA polymerase (... 152 9e-36
gi|2506365|sp|P80194|DPO1_THECA DNA POLYMERASE I, THERMOSTABLE (... 147 2e-34
gi|3328929 (AE001322) DNA Polymerase I [Chlamydia trachomatis] 147 3e-34
gi|3913510|sp|O52225|DPO1_THEFI DNA POLYMERASE I, THERMOSTABLE (...
gi|1205984 (U33536) DNA polymerase I [Bacillus stearothermophilus]
gi|118827|sp|P13252|DPO1_STRPN DNA POLYMERASE I (POL I) >gi|9802...
gi|1942202|pdb|1JXE| Stoffel Fragment Of Taq Dna Polymerase I
gi|1943520|pdb|1KTQ| Dna Polymerase
gi|1084022|pir||JX0359 DNA-directed DNA polymerase (EC 2.7.7.7) ...
gi|507891|dbj|BAA06775| (D32013) DNA Polymerase [Thermus aquaticus]
gi|118828|sp|P19821|DPO1_THEAQ DNA POLYMERASE I, THERMOSTABLE (T...
gi|1706502|sp|P52028|DPO1_THETH DNA POLYMERASE I, THERMOSTABLE (...
gi|1097211|prf||2113329A DNA polymerase [Thermus aquaticus therm...
gi|2098289|pdb|1TAU|A Chain A, Structure Of Dna Polymerase



gi| 1934761 |emb|CAB07981| (Z93946) hypothetical protein (bacterio. 
gi|3540290 (AF057033) putative minor structural protein (Strepto. 
gi | 2444125 (U88974) ORF46 [Streptococcus thermophilus temperate .. 
gi| 1934762 |emb|CAB07982| (Z93946) hypothetical protein (bacterio., 
gi|4530155|gb|AAD21895.1| (AF085222) unknown [Streptococcus ther. . 
gi|2935677 (AF032121) unknown (Streptococcus thermophilus bacter. . 
gi | 2935692 (AF032122) unknown (Streptococcus thermophilus bacter.. 
gi | 1136289 (U42597) histidine kinase A (Dictyostelium discoideum) 

Query= sid|114827|lan|dp1ORF006 Phage dp1 ORF|45296-46987|2
(563 letters)

gi|4377165|gb|AAD18987| (AE001666) SWI/SNF family helicase_2 [Ch...
gi|1769947|emb|CAA67095| (X98455) SNF [Bacillus cereus]
gi|3329163 (AE001341) SWF/SNF family helicase [Chlamydia trachom...
gi|4377149|gb|AAD18973| (AE001664) SWI/SNF family helicase_1 [Ch...
gi|3328995 (AE001326) SWI/SNF family helicase [Chlamydia trachom...
gi|2493354|sp|P75093|Y018_MYCPN HYPOTHETICAL HELICASE MG018/MG01...
gi|1653748|dbj|BAA18659| (D90916) helicase of the snf2/rad54 fam...
gi|1763712|emb|CAB05939| (Z83337) member of the SNF2 helicase fa...
gi|2636153|emb|CAB15645.1| (Z99122) similar to SNF2 helicase [Ba...
gi|2909552|emb|CAA17284| (AL021924) helZ [Mycobacterium tubercul...
gi|3844627 (U39681) ATP-dependent RNA helicase, putative [Mycopl...
gi|1351463|sp|P47264|Y018_MYCGE HYPOTHETICAL HELICASE MG018
gi|2660669 (AC002342) human Mi-2 autoantigen-like protein [Arabi...
gi|1361537|pir||I64201 helicase (mot1) homolog - Mycoplasma geni...
gi|3482977|emb|CAA20533.1| (AL031369) putative protein [Arabidop...
gi|3298562 (U91543) zinc-finger helicase [Homo sapiens]
gi|3875971|emb|CAB02491| (Z80344) similar to helicase; cDNA EST
gi|4557451|ref|NP_001263.1|PCHD3| chromodomain helicase DNA bind...
gi|2645435 (AF007780) CHD3 [Drosophila melanogaster]
gi|3875165|emb|CAA91798| (Z67881) Similarity to Mouse Chromodoma...

Query= sid|114828|lan|dp1ORF007 Phage dp1 ORF|22230-23621|3
(463 letters)

gi|2444105 (U88974) ORF26 [Streptococcus thermophilus temperate ...
gi|3318666 (U19754) BBA31 homolog [Borrelia burgdorferi]
gi|2690260 (AE000790) conserved hypothetical protein [Borrelia b...

Query= sid|114829|lan|dp1ORF008 Phage dp1 ORF|49624-50961|1
(445 letters)

gi|4406210|gb|AAD19901| (AF100420) DnaB replication fork helicas...
gi|3121983|sp|O25916|DNAB_HELPY REPLICATIVE DNA HELICASE >gi|231...
gi|4416322|gb|AAD20314| (AF106032) replicative helicase; DnaB [B...
gi|4155895 (AE001551) REPLICATIVE DNA HELICASE [Helicobacter pyl...
gi|3322317 (AE001191) replicative DNA helicase (dnaB) [Treponema...
gi|138031|sp|P04530|VG41_BPT4 PRIMASE-HELICASE (PROTEIN GP41) >g...
gi|2983861 (AE000742) replicative DNA helicase [Aquifex aeolicus]

Query= sid|114831|lan|dp1ORF010 Phage dp1 ORF|8699-9859|2
(366 letters) 



gi

gi 
gi 
gi 
gi 
gi 
gi 



146 
34


146 


7e- 


34 
9e-


34 
33
1e-


33 


145 


1e-


33 
1e-


33 


144 


2e- 


33 


144 


2e- 


33 


143 


3e- 


33 



1011 
346 
339 
300 
276 
250 
250 
50 



171 
160 
159 
157 
153 
146 
143 
143 
143 
140 
136 
136 
131 
129 
128 
120 
120 
120 
118 
118 



89 
59 
56 



66 
67 
65 
60 
58 
53 
51 



0.0 

2e-94 

3e-92 

2e-80 

4e-73 

3e-65 

3e-65 

7e-05 



1e-41
7e-17
4e-08
3e-06
1e-05



2760912 (AF037258) RecA protein [Chlorobium tepidum] 133 2e-30
3219851|sp|P94666|RECA_CLOPE RECA PROTEIN >gi|1698591 (061497... 129 3e-29
1350566|sp|P48295|RECA_STRVL RECA PROTEIN >gi|508860 (U04837) ... 128 7e-29
744163|prf||2014250A recA-like protein [Streptomyces violaceus] 126 3e-28
730487|sp|P41054|RECA_STRAM RECA PROTEIN >gi|511133|emb|CAA82... 125 4e-28
2687334|emb|CAA15875| (AL020956) RecA protein [Streptomyces c... 125 6e-28
1350565|sp|P48294|RECA_STRLI RECA PROTEIN >gi|481482|pir||S38... 125 6e-28
464599|sp|P33542|RECA_AQUPY RECA PROTEIN >gi|1086167|pir||A55...
417636|sp|P32725|RECA_HOSH RECA PROTEIN >gi|541307|pir||S415...
2984348 (AE000775) recombination protein RecA [Aquifex aeolicus]
3219854|sp|P95846|RECA_STRRM RECA PROTEIN >gi|1729800|emb|CAA...
2500086|sp|Q59560|RECA_MYCSM RECA PROTEIN >gi|1430892|emb|CAA...
1350567|sp|P48296|RECA_THEAQ RECA PROTEIN >gi|1072963|pir||A5...
625663|pir||JX0292 recA protein - Thermus aquaticus (strain HB8
1172880|sp|P42440|RECA_CAMJE RECA PROTEIN >gi|2119991|pir||14...
4154654 (AE001453) RECA PROTEIN [Helicobacter pylori J99]
1072968|pir||C55020 recA protein - Thermus sp >gi|458472|dbj|...
3219852|sp|P95469|RECA_PARDE RECA PROTEIN >gi|1825468 (U59631...
2507284|sp|P42445|RECA_HELPY RECA PROTEIN >gi|2313235|gb|AAD0...
1172890|sp|Q02350|RECA_STAAU RECA PROTEIN >gi|463285 (L25893) ...
4416209|gb|AAD20261| (AF094756) RecA protein [Bifidobacterium...
2500084|sp|Q59180|RECA_BORBU RECA PROTEIN >gi|1276443 (U23457...



gi|
gi|

gi|
gi|
gi|
gi|
gi|
gi|
gi|
gi|
gi|
gi|
gi|
gi|
gi|

Query= sid|114832|lan|dp1ORF011 Phage dp1 ORF|28017-29096|3
(359 letters)

gi|2444110 (U88974) ORF31 [Streptococcus thermophilus temperate ...
gi|3320438 (AF057033) gp348 [Streptococcus thermophilus bacterio...
gi|479514|pir||S34244 hypothetical protein p38 - actinophage VWB...

Query= sid|114834|lan|dp1ORF013 Phage dp1 ORF|10215-11240|3
(341 letters) 



gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 



580855|emb|CAA29958| (X06803) dnaZX-like ORF put. DNA polymer...
118807|sp|P09122|DP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA...
98292|pir||S13786 DNA-directed DNA polymerase (EC 2.7.7.7) II...
1527142 (U66040) DNA polymerase III gamma subunit [Salmonella...
2494197|sp|P74876|DP3X_SALTY DNA POLYMERASE III SUBUNITS GAMM...
118808|sp|P06710|DP3X_ECOLI DNA POLYMERASE III SUBUNITS GAMMA...
4155207 (AE001497) DNA POLYMERASE III SUBUNITS GAMMA AND TAU ...
2313841|gb|AAD07767.1| (AE000584) DNA polymerase III gamma an...
2583049 (AF025391) DNA polymerase III holoenzyme tau subunit ...
2984127 (AE000759) DNA polymerase III gamma subunit [Aquifex ...
3861390|emb|CAA15289| (AJ235273) DNA POLYMERASE III SUBUNITS ...
1169397|sp|P43746|DP3X_HAEIN DNA POLYMERASE III SUBUNITS GAMM...
1293572 (U49738) DNA polymerase III tau homolog DnaX [Cauloba...
3328753 (AE001306) DNA Pol III Gamma and Tau [Chlamydia trach...
4376294|gb|AAD18193| (AE001589) DNA Polymerase III Gamma and ...
581255|emb|CAA28175| (X04487) alternate dnaZX protein (AA 1-6...
2688379 (AE001151) DNA polymerase III, subunits gamma and tau...
3323329 (AE001268) DNA polymerase III, subunits gamma and tau...



Query= sid|114835|lan|dp1ORF014 Phage dp1 ORF|50961-51974|3
(337 letters)

gi|1346796|sp|P47492|PRIM_MYCGE DNA PRIMASE >gi|1361496|pir||F64...
gi|740008|prf||2004290A primase [Haemophilus influenzae]
gi|1172619|sp|Q08346|PRIM_HAEIN DNA PRIMASE >gi|1074033|pir||A64...
gi|1709769|sp|Q04505|PRIM_LACLA DNA PRIMASE >gi|1075726|pir||JC2...
gi|639846|dbj|BAA03516| (D14690) DNA primase [Lactococcus lactis]

Query= sid|114837|lan|dp1ORF016 Phage dp1 ORF|43413-44303|3
(296 letters) 

gi|1934766|emb|CAB07986| (Z93946) N-acetylmuramoyl-L-alanine ami...
gi|113676|sp|P06653|ALYS_STRPN AUTOLYSIN (N-ACETYLMURAMOYL-L-ALA...
gi|282326|pir||A42935 N-acetylmuramoyl-L-alanine amidase (EC 3.5...
gi|416618|sp|P32762|ALYS_BPHB3 LYTIC AMIDASE (N-ACETYLMURAMOYL-L...
gi|285273|pir||A42936 N-acetylmuramoyl-L-alanine amidase (EC 3.5...
gi|127787|sp|P15057|LYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE)...
gi|67761|pir||MUBPCP N-acetylmuramoyl-L-alanine amidase (EC 3.5...
gi|127789|sp|P19386|LYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE)...
gi|928832 (L44593) ORF259; putative [Lactococcus lactis phage BK...
gi|2511705|emb|CAA71783| (Y10818) sigA binding protein [Streptoc...
gi|4097980 (U72655) surface protein C [Streptococcus pneumoniae]
gi|2351768 (U89711) PspA [Streptococcus pneumoniae]
gi|2425109 (AF019904) choline binding protein A [Streptococcus p...
gi|282335|pir||A41971 surface protein pspA precursor - Streptoco...
gi|2576331|emb|CAA05158| (AJ002054) SpsA protein [Streptococcus ...
gi|2127295|pir||S57962 cspC protein - Clostridium acetobutylicum...
gi|2576333|emb|CAA05159| (AJ002055) SpsA protein [Streptococcus ...
gi|4106522|gb|AAD02874.1| (AF097909) excreted protein FibB [Pept...
gi|1361406|pir||S57714 cspB protein - Clostridium acetobutylicum...
gi|1914872|emb|CAB04758| (Z82001) PCPA [Streptococcus pneumoniae]
gi|3168594|dbj|BAA28613| (AB012763) SpaA [Erysipelothrix rhusiop...
gi|2292750|emb|CAA64942| (X95646) homology to orf259 of lactococ...
gi|2935696 (AF032122) putative lysin [Streptococcus thermophilus...
gi|4586910|dbj|BAA76540.1| (AB017447) protective antigen SpaA.1 ...
gi|3540294 (AF057033) lysin [Streptococcus thermophilus bacterio...

Query= sid|114841|lan|dp1ORF020 Phage dp1 ORF|1864-2658|1
(264 letters) 



2633745|emb|CAB13247| (Z99111) similar to coenzyme PQQ synthe...
2808502|emb|CAA12532| (AJ225561) ExsD protein [Sinorhizobium ...
3861151|emb|CAA15051| (AJ235272) unknown [Rickettsia prowazekii]
1652793|dbj|BAA17712| (D90908) hypothetical protein [Synechoc...
1723815|sp|P55139|YGCF_ECOLI HYPOTHETICAL 25.0 KD PROTEIN IN ...
2984272 (AE000769) hypothetical protein [Aquifex aeolicus]
4155435 (AE001516) putative [Helicobacter pylori J99]
2127833|pir||C64505 coenzyme PQQ synthesis protein III homolo...
2622338 (AE000890) coenzyme PQQ synthesis protein III [Methan...
3257042|dbj|BAA29725| (AP000003) 254aa long hypothetical prot...
2314068|gb|AAD07976.1| (AE000602) conserved hypothetical prot...
1723816|sp|P45097|YGCF_HAEIN HYPOTHETICAL PROTEIN HI1189 >gi|...
gi
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 
gi 

Query= sid|114842|lan|dp1ORF021 Phage dp1 ORF|2504-3295|2
(263 letters)

gi|127481|sp|P19465|GCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >... 208 4e-53
gi|3242315|emb|CAA04237| (AJ000685) GTP cyclohydrolase [Streptoc... 191 4e-48
gi|2494695|sp|Q54769|GCH1_SYNP7 GTP CYCLOHYDROLASE I (GTP-CH-I) ... 189 2e-47
gi|255061|bbs|112832 (S44049) GTP cyclohydrolase I {clone hGCH-1 ... 187 7e-47
gi|4503949|ref|NP_000152.1|PGCH1| GTP cyclohydrolase 1 (dopa-res... 187 7e-47
gi|2113967|emb|CAB08935| (Z95557) folE [Mycobacterium tuberculosis] 187 7e-47
gi|1730240|sp|P50141|GCH1_CHICK GTP CYCLOHYDROLASE I (GTP-CH-I) ... 185 3e-46
gi|2494696|sp|Q55759|GCH1_SYNY3 GTP CYCLOHYDROLASE I (GTP-CH-I) ... 184 5e-46
gi|121061|sp|P22288|GCH1_RAT GTP CYCLOHYDROLASE I PRECURSOR (GTP... 184 6e-46
gi|3183014|sp|O13774|GCH1_SCHPO GTP CYCLOHYDROLASE I (GTP-CH-I) ... 184 6e-46
gi|3097224|emb|CAA18795| (AL023093) GTP cyclohydrolase I [Mycoba... 182 2e-45
gi|2494697|sp|Q19980|GCH1_CAEEL PROBABLE GTP CYCLOHYDROLASE I (G... 182 2e-45
gi|462167|sp|Q05915|GCH1_MOUSE GTP CYCLOHYDROLASE I PRECURSOR (G... 180 7e-45
gi|1669664|emb|CAA89808| (Z49706) GTP cyclohydrolase I [Dictyost... 180 1e-44
gi|2981082 (AF052048) GTP-cyclohydrolase [Ostertagia ostertagi] 178 3e-44
gi|31954|emb|CAA78908| (Z16418) GTP cyclohydrolase I [Homo sapi... 177 8e-44
gi|551344|bbs|150280 (S71373) GTP cyclohydrolase I [mice, Peptid... 174 5e-43
gi|1730247|sp|P51601|GCH1_YEAST GTP CYCLOHYDROLASE I (GTP-CH-I) ... 174 7e-43
gi|1246912|emb|CAA87397| (Z47201) GTP cyclohydrolase 1 [Saccharo... 172 2e-42
gi|1730246|sp|P51595|GCH1_STRPN GTP CYCLOHYDROLASE I (GTP-CH-I) ... 168 3e-41
gi|2982951 (AE000680) GTP cyclohydrolase I [Aquifex aeolicus] 164 6e-40

Query= sid|114843|lan|dp1ORF022 Phage dp1 ORF|30896-31675|2
(259 letters)

gi|2347102 (U77367) internalin [Listeria monocytogenes]
gi|3123226|sp|P25146|INLA_LISMO INTERNALIN A PRECURSOR >gi|48705...
gi|149674 (M67471) internalin [Listeria monocytogenes]

Query= sid|114850|lan|dp1ORF029 Phage dp1 ORF|662-1348|2
(228 letters) 



2650185 (AE001074) succinoglycan biosynthesis regulator (exsB...
3861231|emb|CAA15131| (AJ235272) unknown [Rickettsia prowazekii]
2622210 (AE000881) conserved protein [Methanobacterium thermo...
2983380 (AE000709) trans-regulatory protein ExsB [Aquifex aeo...
1001327|dbj|BAA10814| (D64006) ExsB [Synechocystis sp.]
2128055|pir||B64468 hypothetical protein homolog MJ1347 - Met...
4155143 (AE001491) putative [Helicobacter pylori J99]
2313760|gb|AAD07701.1| (AE000578) conserved hypothetical prot...
2120814|pir||S60183 protein ExsB - Rhizobium meliloti >gi|114...
2633743|emb|CAB13245| (Z99111) similar to hypothetical protei...
1175543|sp|P44124|YBAX_HAEIN HYPOTHETICAL PROTEIN HI1191 >gi|...
2495537|sp|P77756|YBAX_ECOLI HYPOTHETICAL 25.5 KD PROTEIN IN ...
3256471|dbj|BAA29154.1| (AP000001) 269aa long hypothetical pr...
2921156 (AF022216) aluminum resistance protein [Arthrobacter ...

Query= sid|114855|lan|dp1ORF034 Phage dp1 ORF|131-652|2
(173 letters) 

gi|2633746|emb|CAB13248| (Z99111) similar to hypothetical protei...
gi|4155926 (AE001554) putative [Helicobacter pylori J99] 162 1e-39
gi|2314588|gb|AAD08456.1| (AE000642) conserved hypothetical prot... 161 3e-39
gi|2983458 (AE000714) hypothetical protein [Aquifex aeolicus] 103 9e-22
gi|1006604|dbj|BAA10757| (D64005) hypothetical protein [Synechoc... 87 6e-17
gi|2967529 (U11045) unknown [Buchnera aphidicola] 79 2e-14
gi|2495654|sp|Q46920|YQCD_ECOLI HYPOTHETICAL 32.6 KD PROTEIN IN ... 69 2e-11
gi|1175604|sp|P44153|YQCD_HAEIN HYPOTHETICAL PROTEIN HI1291 >gi|... 63 1e-09
gi|3860642|emb|CAA14543| (AJ235270) unknown [Rickettsia prowazekii] 56 1e-07

Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1
(184 letters)

gi|1353529 (U38906) ORF12 [Bacteriophage r1t]

Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3
(173 letters)

gi|1175542|sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi|...
gi|2982977 (AE000681) hypothetical protein [Aquifex aeolicus]
gi|3860744|emb|CAA14645| (AJ235270) unknown [Rickettsia prowazekii]
gi|2650193 (AE001074) conserved hypothetical protein [Archaeoglo...
gi|3258383|dbj|BAA31066.1| (AP000007) 157aa long hypothetical pr...
gi|1001713|dbj|BAA10550| (D64004) hypothetical protein [Synechoc...
gi|4155434 (AE001516) putative [Helicobacter pylori J99]

Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3
(165 letters)

gi|1922884|emb|CAA68244| (X99978) ORF7; hydophobic protein [Lact...

Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3
(163 letters)

gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis] >gi|26...
gi|2634150|emb|CAB13650| (Z99113) similar to deoxyuridine 5'-tri...
gi|3913546|sp|O54134|DUT_STRCO DEOXYURIDINE 5'-TRIPHOSPHATE NUCL...
gi|3913542|sp|O48500|DUT_BPT5 DEOXYURIDINE 5'-TRIPHOSPHATE NUCLE...
gi|3913548|sp|O68992|DUT_CHLTE DEOXYURIDINE 5'-TRIPHOSPHATE NUCL...

Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3
(142 letters)

gi|1934764|emb|CAB07984| (Z93946) hypothetical protein [bacterio... 287 2e-77

Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1
(89 letters)

gi|1934763|emb|CAB07983| (Z93946) hypothetical protein [bacterio... 147 1e-35

Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1
(74 letters) 



64 5e-10



108 2e-23 

108 3e-23 

56 2e-07 

52 3e-06 

50 1e-05



gi|1934765|emb|CAB07985| (Z93946) holin [bacteriophage Dp-1]



63 2e-10 
Table 32



Sequence of Dp-1 published by Sheehan et al.; 4731 nucleotides.



1 tttaaatttt ttgacaaagt taactcaaat tgtaccgctg aagcaatttt ccatgtattc actcaaagtt

71 gttcagcgtg gctcaatcat attaaaatcg aacttggtaa tatctctact ccttttagtg aagcagagga 

141 agaccttaaa tatcgaattg actcaaaagc cgatcaaaag ctaactaacc aacagttgac ggcactcacg 

211 gaaaaggctc aactacatga cgcagaactg aaagctaagg ctacaatgga gcagttaagt aacttagaaa 

281 aggcttatga aggtagaatg aaagccaatg aagaagctat caacaaatcg gaacccgacc taatcttagc 

351 ggcaagtcga attgaagcta ctatccaaga acttggcggg ctacgggaac tgaagaagtt cgtcgacagt 

421 tgcatgagct cttctaatca aggtctaatt atcggtaaga acgacggtag ctctaccatt aaggtatcaa 

491 gtgaccgaat ttctatgttc tccgcaggga atgaagttat gtaccttacg caagggttca ttcacatcga 

561 taacgggatc tttacccaat ccattcaagt cggccgattt agaacggaac aatactcgtt taatccagac 

631 atgaacgtga ttcggtatgt aggataagga gaataacatg acaaaattta tcaactcata cggccctctt 

701 cacttgaacc tttacgtcga acaagctagt caggacgtaa cgaacaactc ctcgcgagtt agttggcgag 

771 ctactgtcga ccgcgatgga gcttatcgaa cgtggactta tggaaatatt agtaaccttt ccgtatggtt 

841 aaatggttca agtgttcata gcagtcaccc agactacgac acgtccggcg aagaggtaac gctcgcaagt 

911 ggagaagtga ctgttcctca caatagtgac gggacaaaga caatgtccgt ttgggcttcg tttgacccta 

981 ataacggcgt tcacggaaat atcactatct ctactaatta cactttagac agtattccaa ggtctacaca 

1051 gatttctagt tttgagggaa atcgaaatct aggatcttta catacggtta tctttaaccg aaaagtgaac 

1121 tcttttacgc atcaagtttg gtaccgagtt ttcggtagcg actggataga tttaggtaag aaccatacta 

1191 ctagcgtatc ctttacgccg tcactggact tagcaaggta cttacctaaa tcaagttccg gaacaatgga 

1261 catctgtatt cgaacctata acggaactac gcaaattggt agtgacgtct attcaaacgg acggaggttc 

1331 aacatccccg attcagtacg tcctactttt tcgggcattt ctttagtaga cacgacttca gcggttcgac 

1401 agattttaac agggaacaac ttcctccaaa tcatgtcgaa cattcaagtc aacttcaaca atgcttccgg 

1471 cgcttacgga tccactatcc aagcatttca cgctgagctc gtaggtaaaa accaagctat caacgaaaac 

1541 ggcggcaaat tgggtatgat gaactttaat ggctccgcta ccgtaagagc atgggttaca gacacgcgag 

1611 gaaaacaatc gaacgtccaa gacgtatcta tcaatgttat agaatactat ggaccgtcta tcaatttctc 

1681 cgttcaacgt actcgtcaaa atcctgcaat tatccaagct cttcgaaatg ctaaggtcgc acctataacg 

1751 gtaggaggtc aacagaaaaa catcatgcaa attaccttct ccgtggcgcc gttgaacact actaatttca 

1821 cagaagatag aggttcggcg tcagggacgt tcactactat ttccctactg actaactcgt ccgcgaactt 

1891 agctggtaac tacgggccgg acaagtctta catagttaag gctaaaatcc aagacaggtt cacttcgact 

1961 gaatttagtg ctacggtacc taccgaatca gtagttctta actatgacaa ggacggtcga cttggagttg 

2031 gtaaggttgt agaacaaggg aaggcagggt caattgatgc agcaggtgat atatatgctg gaggtcgaca 

2101 agttcaacag tttcagctca ctgataataa tggagcattg aacaggggtc aatataacga tgttggaata 

2171 agcgtgaaac agagtttaca tggcgaagta acaaatacga ggacaaccct acgggaactc gaggtgaatg 

2241 gggactattt caaaatttct ggttagatag ctggaaaatg gttcaatcct tcattacaat gtcaggaaga 

2311 atgttcatca ggacagcgaa cgatggaaac agctggagac ctaacaagtg gaaagaggtt ctatttaagc 

2381 aagacttcga acagaataat tggcagaaac ttgttcttca aagtgggtgg aaccatcact caacctatgg 

2451 cgacgcattc tattcgaaaa ctcttgacgg catagtatat ttgagaggaa atgtgcataa aggacttatc 

2521 gacaaagagg ctactattgc agtacttcct gaaggattta gaccgaaagt ttcaatgtat cttcaggctc 

2591 tcaataactc atatggaaat gccattctat gtatatacac tgacggaaga cttgtggtga aatcgaatgt 

2661 agataattct tggttaaatt tagacaatgt ctcatttcgt atttaatttg agctgaaatc atgttataat 

2731 attttttaga aaggaggtga gaactatgtt gaaccttaca aaatcgcgcc aaattgtggc agagttcact 

2801 attggacaag gagctgaaaa gaaacttgtc aaaacaacga ttgtgaacat tgatgcaaac gcagtatcaa 

2871 ccgtctctga aactcttcat gacccagact tgtatgctgc gaaccgtcga gaacttcgag ctgacgagca 

2941 aaaacttcgc gaaactcgtt acgcaatcga agatgaaatt aatagctgga gcgggggaaa aaagggggag 

3011 cccggctcta acaggctgaa taaggaggcg tcaatctatg ccaatgtggc taaacgacac cgcagtcttg 

3081 acgacgatta ttacagcgtg cagcggagtg cttactgtcc tactaaataa gttattcgaa tggaaatcga 

3151 ataaagccaa gagcgtttta gaggatatct ctacaactct tagcactctt aaacagcagg tcgacgggat 

3221 tgaccaaacg acagtagcaa tcaatcacca aaatgacgtc attcaagacg gaactagaaa aattcaacgt 

3291 taccgtcttt atcacgactt aaaaagggaa gtgataacag gctatacaac tctcgaccat tttagagagc 

3361 tctctatttt attcgaaagt tataagaacc ttggcggaaa tggtgaagtt gaagccttgt atgaaaaata 

3431 caagaaatta ccaattaggg aggaagattt agatgaaact atctaacgaa caatatgacg tagcaaagaa 

3501 cgtggtaacc gtagtcgttc cagcagcgat tgcactaatt acaggtcttg gagcgttgta tcaatttgac 

3571 actactgcta tcacaggaac cattgcactt cttgcaactt ttgcaggtac tgttctagga gtttctagcc 

3641 gaaactacca aaaggaacaa gaagctcaaa acaatgaggt ggaataatgg gagtcgatat tgaaaaaggc 

3711 gttgcgtgga tgcaggcccg aaagggtcga gtatcttata gcatggactt tcgagacggt cctgatagct 

3781 acgactgctc aagttctatg tactatgctc tccgctcagc cggagcttca agtgctggat gggcagtcaa 

3851 tactgagtac atgcacgcat ggcttattga aaacggttat gaactaatta gtgaaaatgc tccgtgggac 

3921 gctaaacgag gcgacatctt catctgggga cgcaaaggtg ctagcgcagg cgctggaggt catacaggga 

3991 tgttcattga cagtgataac atcattcact gcaaccacgc ctacgacgga atttccgtca acgaccacga 

4061 tgagcgttgg tactatgcag gtcaacctta ctactacgtc tatcgcttga ctaacgcaaa tgctcaaccg 

4131 gctgagaaga aacttggctg gcagaaagat gctactggtt tctggtacgc tcgagcaaac ggaacttatc 

4201 caaaagacga gttcgagtat atcgaagaaa acaagtcctg gttctacttt gacgaccaag gctacatgct 

4271 cgctgagaaa tggttgaaac atactgatgg aaattggtat tggttcgacc gtgacggata catggctacg

4341 tcatggaaac ggattggcga gtcacggtac tacttcaatc gcgatggttc aatggtaacc ggttggatta

4411 agtattacga taattggtat tattgtgatg ctaccaacgg cgacatgaaa tcgaatgcgt ttatccgtta 

4481 taacgacggc tggtatctac tattaccgga cggacgtctg gcagataaac ctcaattcac cgtagagccg 

4551 gacgggctca ttactgctaa agtttaaaat atagagagga ggaagctctt ttcttaatat cgtttctctt

4621 aatcccgcaa ggtttcgacc ctgcggggtt tatgtgtcgt gaattactct atttacttat tcgaagattt 

4691 caattataat taaataatca acgagattca taattggagg aatg 
Streptococcus accession numbers
gi|5776553|gb|AF026471.2|AF026471 [5776553]

gi|5410470|gb|AF139890.1|AF139890 [5410470]
gi|5410468|gb|AF139889.1|AF139889 [5410468]
gi|5410466|gb|AF139888.1|AF139888 [5410466]
gi|5410464|gb|AF139887.1|AF139887 [5410464]
gi|5410462|gb|AF139886.1|AF139886 [5410462]
gi|5410460|gb|AF139885.1|AF139885 [5410460]
gi|5410458|gb|AF139884.1|AF139884 [5410458]
gi|5410456|gb|AF139883.1|AF139883 [5410456]
gi|3093394|emb|AJ005697.1|SPN5697 [3093394]
gi|5759208|gb|AF171873.1|AF171873 [5759208]
gi|5758311|gb|AF162664.1|AF162664 [5758311]
gi|5739313|gb|AF161701.1|AF161701 [5739313]
gi|5739310|gb|AF161700.1|AF161700 [5739310]
gi|5726354|gb|AF159448.1|AF159448 [5726354]
gi|5726290|gb|AF127143.1|AF127143 [5726290]
gi|5712666|gb|AF140784.1|AF140784 [5712666]
gi|4218525|emb|AJ009639.1|SPAJ9639 [4218525]
gi|5616524|gb|AF169483.1|AF169483 [5616524]
gi|5579395|gb|AF162656.1|AF162656 [5579395]
gi|5579393|gb|AF162655.1|AF162655 [5579393]

gi|5578890|emb|AJ131985.1|SPN131985
[5578890]

gi|5566442|gb|AF167442.1|AF167442 [5566442]

gi|5459332|emb|AJ243540.1|EVE243540
[5459332]

gi|5305398|gb|AF072811.1|AF072811 [5305398]

gi|5295921|emb|AJ242698.1|SPN242698
[5295921]

gi|5295920|emb|AJ242697.1|SPN242697
[5295920]

gi|5295919|emb|AJ242696.1|SPN242696
[5295919]

gi|5295918|emb|AJ242695.1|SPN242695
[5295918]

gi|4583522|gb|AF140356.1|AF140356 [4583522]
gi|5231206|gb|AF157826.1|AF157826 [5231206]
gi|5231203|gb|AF157825.1|AF157825 [5231203]



gi|5231200|gb|AF157824.1|AF157824 [5231200]
gi|5231197|gb|AF157823.1|AF157823 [5231197]
gi|5231194|gb|AF157822.1|AF157822 [5231194]
gi|5231191|gb|AF157821.1|AF157821 [5231191]
gi|5231188|gb|AF157820.1|AF157820 [5231188]
gi|5231185|gb|AF157819.1|AF157819 [5231185]
gi|5231182|gb|AF157818.1|AF157818 [5231182]
gi|5231179|gb|AF157817.1|AF157817 [5231179]
gi|4336851|gb|AF106138.1|AF106138 [4336851]
gi|4336848|gb|AF106137.1|AF106137 [4336848]
gi|4336845|gb|AF106136.1|AF106136 [4336845]
gi|4336842|gb|AF106135.1|AF106135 [4336842]
gi|4336839|gb|AF106134.1|AF106134 [4336839]
gi|4336836|gb|AF106133.1|AF106133 [4336836]
gi|4336833|gb|AF106132.1|AF106132 [4336833]
gi|3907597|gb|AF094575.1|AF094575 [3907597]
gi|5030425|gb|AF061748.2|AF061748 [5030425]

gi|4902881|emb|AJ239004.1|SPN239004
[4902881]

gi|5001710|gb|AF112358.1|AF112358 [5001710]
gi|5001690|gb|AF106539.1|AF106539 [5001690]
gi|4973271|gb|AF144420.1|AF144420 [4973271]
gi|4973269|gb|AF144419.1|AF144419 [4973269]
gi|4973267|gb|AF144418.1|AF144418 [4973267]
gi|4928190|gb|AF129757.1|AF129757 [4928190]
gi|4927743|gb|AF126061.1|AF126061 [4927743]
gi|4927742|gb|AF126060.1|AF126060 [4927742]
gi|4927741|gb|AF126059.1|AF126059 [4927741]
gi|4495247|emb|AJ240675.1|SPN240675
[4495247]

gi|4495245|emb|AJ240670.1|SPN240670
[4495245]

gi|4495243|emb|AJ240669.1|SPN240669
[4495243]

gi|4495241|emb|AJ240668.1|SPN240668
[4495241]

gi|4495239|emb|AJ240667.1|SPN240667
[4495239]

gi|4495237|emb|AJ240666.1|SPN240666
[4495237]

gi|4495235|emb|AJ240665.1|SPN240665
[4495235]

gi|4495233|emb|AJ240664.1|SPN240664
[4495233]

gi|4495231|emb|AJ240663.1|SPN240663
[4495231]

gi|4495229|emb|AJ240662.1|SPN240662
[4495229]

gi|4495227|emb|AJ240661.1|SPN240661
[4495227]

gi|4495225|emb|AJ240660.1|SPN240660
[4495225]

gi|4495223|emb|AJ240659.1|SPN240659
[4495223]

gi|4495221|emb|AJ240658.1|SPN240658
[4495221]

gi|4495219|emb|AJ240657.1|SPN240657
[4495219]

gi|4495217|emb|AJ240656.1|SPN240656
[4495217]

gi|4495215|emb|AJ240655.1|SPN240655
[4495215]

gi|4495213|emb|AJ240654.1|SPN240654
[4495213]

gi|4495211|emb|AJ240653.1|SPN240653
[4495211]

gi|4495209|emb|AJ240652.1|SPN240652
[4495209]

gi|4495207|emb|AJ240651.1|SPN240651
[4495207]

gi|4495205|emb|AJ240650.1|SPN240650
[4495205]

gi|4495203|emb|AJ240649.1|SPN240649
[4495203]

gi|4495201|emb|AJ240648.1|SPN240648
[4495201]

gi|4495199|emb|AJ240647.1|SPN240647
[4495199]

gi|4495197|emb|AJ240644.1|SPN240644
[4495197]

gi|4495195|emb|AJ240643.1|SPN240643
[4495195]

gi|4495193|emb|AJ240642.1|SPN240642
[4495193]

gi|4495191|emb|AJ240641.1|SPN240641
[4495191]

gi|4495189|emb|AJ240640.1|SPN240640
[4495189]

gi|4495187|emb|AJ240639.1|SPN240639
[4495187]

gi|4495185|emb|AJ240638.1|SPN240638
[4495185]

gi|4495183|emb|AJ240637.1|SPN240637
[4495183]

gi|4495181|emb|AJ240636.1|SPN240636
[4495181]

gi|4495179|emb|AJ240635.1|SPN240635
[4495179]

gi|4495177|emb|AJ240634.1|SPN240634
[4495177]

gi|4495175|emb|AJ240633.1|SPN240633
[4495175]

gi|4495173|emb|AJ240630.1|SPN240630
[4495173]

gi|4495171|emb|AJ240629.1|SPN240629
[4495171]

gi|4495169|emb|AJ240628.1|SPN240628
[4495169]

gi|4495167|emb|AJ240627.1|SPN240627
[4495167]

gi|4495165|emb|AJ240626.1|SPN240626
[4495165]

gi|4495163|emb|AJ240625.1|SPN240625
[4495163]

gi|4495161|emb|AJ240624.1|SPN240624
[4495161]

gi|4495159|emb|AJ240623.1|SPN240623
[4495159]

gi|4495157|emb|AJ240622.1|SPN240622
[4495157]

gi|4495155|emb|AJ240621.1|SPN240621
[4495155]

gi|4495153|emb|AJ240620.1|SPN240620
[4495153]

gi|4495151|emb|AJ240619.1|SPN240619
[4495151]

gi|4495149|emb|AJ240616.1|SPN240616
[4495149]

gi|4495147|emb|AJ240615.1|SPN240615
[4495147]

gi|4495145|emb|AJ240614.1|SPN240614
[4495145]

gi|4495143|emb|AJ240613.1|SPN240613
[4495143]

gi|4495141|emb|AJ240612.1|SPN240612
[4495141]

gi|4495139|emb|AJ240611.1|SPN240611
[4495139]

gi|4495137|emb|AJ240610.1|SPN240610
[4495137]

gi|4495135|emb|AJ240609.1|SPN240609
[4495135]

gi|4495133|emb|AJ240608.1|SPN240608
[4495133]

gi|4495131|emb|AJ240607.1|SPN240607
[4495131]

gi|4495129|emb|AJ240606.1|SPN240606
[4495129]

gi|4883698|gb|AF079807.1|AF079807 [4883698]

gi|4838562|gb|AF145055.1|AF145055 [4838562]

gi|4063727|gb|U9324.1|STRINTE [4063727]

gi|3093401|emb|AJ005619.1|SPAJ5619 [3093401]

gi|4103889|gb|AF029368.1|AF029368 [4103889]

gi|2897689|dbj|D63805.1|D63805 [2897689]

gi|4566771|gb|AF117741.1|AF117741 [4566771]

gi|4566768|gb|AF117740.1|AF117740 [4566768]

gi|4538836|emb|AJ240793.1|SPN240793
[4538836]

gi|4538832|emb|AJ240792.1|SPN240792
[4538832]

gi|4538828|emb|AJ240791.1|SPN240791
[4538828]

gi|4538824|emb|AJ240790.1|SPN240790
[4538824]

gi|4538821|emb|AJ240789.1|SPN240789
[4538821]

gi|4538818|emb|AJ240788.1|SPN240788
[4538818]

gi|4538815|emb|AJ240787.1|SPN240787
[4538815]

gi|4538812|emb|AJ240786.1|SPN240786
[4538812]

gi|4538809|emb|AJ240785.1|SPN240785
[4538809]

gi|4538806|emb|AJ240784.1|SPN240784
[4538806]

gi|4538803|emb|AJ240783.1|SPN240783
[4538803]

gi|4538800|emb|AJ240782.1|SPN240782
[4538800]

gi|4538797|emb|AJ240781.1|SPN240781
[4538797]

gi|4538794|emb|AJ240780.1|SPN240780
[4538794]

gi|4538791|emb|AJ240779.1|SPN240779
[4538791]

gi|4538788|emb|AJ240778.1|SPN240778
[4538788]

gi|4538785|emb|AJ240777.1|SPN240777
[4538785]

gi|4538782|emb|AJ240776.1|SPN240776
[4538782]

gi|4538779|emb|AJ240775.1|SPN240775
[4538779]

gi|4538776|emb|AJ240774.1|SPN240774
[4538776]

gi|4538773|emb|AJ240773.1|SPN240773
[4538773]

gi|4538770|emb|AJ240772.1|SPN240772
[4538770]

gi|4538767|emb|AJ240771.1|SPN240771
[4538767]

gi|4538764|emb|AJ240770.1|SPN240770
[4538764]

gi|4538761|emb|AJ240769.1|SPN240769
[4538761]

gi|4538758|emb|AJ240768.1|SPN240768
[4538758]

gi|4538755|emb|AJ240767.1|SPN240767
[4538755]

gi|4538752|emb|AJ240766.1|SPN240766
[4538752]

gi|4538749|emb|AJ240765.1|SPN240765
[4538749]

gi|4538746|emb|AJ240761.1|SPN240761
[4538746]

gi|4538743|emb|AJ240760.1|SPN240760
[4538743]

gi|4538740|emb|AJ240759.1|SPN240759
[4538740]

gi|4538737|emb|AJ240758.1|SPN240758
[4538737]

gi|4538734|emb|AJ240757.1|SPN240757
[4538734]

gi|4538731|emb|AJ240756.1|SPN240756
[4538731]

gi|4538728|emb|AJ240755.1|SPN240755
[4538728]

gi|4538725|emb|AJ240754.1|SPN240754
[4538725]

gi|4538722|emb|AJ240753.1|SPN240753
[4538722]

gi|4538719|emb|AJ240752.1|SPN240752
[4538719]

gi|4538716|emb|AJ240751.1|SPN240751
[4538716]

gi|4538713|emb|AJ240750.1|SPN240750
[4538713]

gi|4538710|emb|AJ240749.1|SPN240749
[4538710]

gi|4538707|emb|AJ240748.1|SPN240748
[4538707]

gi|4538704|emb|AJ240747.1|SPN240747
[4538704]

gi|4538701|emb|AJ240746.1|SPN240746
[4538701]

gi|4538698|emb|AJ240745.1|SPN240745
[4538698]

gi|4538695|emb|AJ240744.1|SPN240744
[4538695]

gi|4538692|emb|AJ240743.1|SPN240743
[4538692]

gi|4538689|emb|AJ240742.1|SPN240742
[4538689]

gi|4538686|emb|AJ240741.1|SPN240741
[4538686]

gi|4538683|emb|AJ240740.1|SPN240740
[4538683]

gi|4538680|emb|AJ240739.1|SPN240739
[4538680]

gi|4538677|emb|AJ240738.1|SPN240738
[4538677]

gi|4530444|gb|AF118229.1|AF118229 [4530444]
gi|4519253|dbj|AB015852.1|AB015852 [4519253]
gi|4519251|dbj|AB015851.1|AB015851 [4519251]
gi|4519249|dbj|AB015850.1|AB015850 [4519249]
gi|4519247|dbj|AB015849.1|AB015849 [4519247]
gi|4519245|dbj|AB015848.1|AB015848 [4519245]
gi|4519243|dbj|AB015847.1|AB015847 [4519243]
gi|4519241|dbj|AB015846.1|AB015846 [4519241]
gi|4519239|dbj|AB011210.1|AB011210 [4519239]
gi|4519237|dbj|AB011209.1|AB011209 [4519237]
gi|4519235|dbj|AB011208.1|AB011208 [4519235]

gi|4519233|dbj|AB011207.1|AB011207 [4519233]

gi|4519231|dbj|AB011206.1|AB011206 [4519231]

gi|4519229|dbj|AB011205.1|AB011205 [4519229]

gi|4519227|dbj|AB011204.1|AB011204 [4519227]

gi|4519225|dbj|AB011203.1|AB011203 [4519225]

gi|4519223|dbj|AB011202.1|AB011202 [4519223]

gi|4519221|dbj|AB011201.1|AB011201 [4519221]

gi|4519219|dbj|AB011200.1|AB011200 [4519219]

gi|4519217|dbj|AB011199.1|AB011199 [4519217]

gi|4519215|dbj|AB011198.1|AB011198 [4519215]

gi|4495127|emb|AJ240605.1|SPN240605
[4495127]

gi|4468031|emb|AJ132957.1|SPN132957
[4468031]

gi|4468029|emb|AJ132956.1|SPN132956
[4468029]

gi|4218532|emb|AJ010312.1|SPN010312
[4218532]

gi|4456852|emb|AJ236792.1|SPN236792
[4456852]

gi|4456850|emb|AJ236791.1|SPN236791
[4456850]

gi|4456848|emb|AJ236790.1|SPN236790
[4456848]

gi|4456846|emb|AJ236789.1|SPN236789
[4456846]

gi|3550644|emb|AJ006987.1|SPAJ6987 [3550644]

gi|3550625|emb|AJ006986.1|SPAJ6986 [3550625]

gi|4416518|gb|AF014458.2|AF014458 [4416518]

gi|4406260|gb|AF105116.1|AF105116 [4406260]

gi|4406257|gb|AF105115.1|AF105115 [4406257]

gi|4406254|gb|AF105114.1|AF105114 [4406254]

gi|4406246|gb|AF105113.1|AF105113 [4406246]

gi|4406243|gb|AF105112.1|AF105112 [4406243]

gi|4138533|emb|AJ005815.1|SPN5815 [4138533]

gi|3821726|emb|AJ232433.1|SPN232433
[3821726]

gi|3821724|emb|AJ232432.1|SPN232432
[3821724]

gi|3821722|emb|AJ232431.1|SPN232431
[3821722]

gi|3821720|emb|AJ232430.1|SPN232430
[3821720]
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gi|382 1 7 1 8|emb|A J232429. 1 |SPN232429 
[3821718] 

gi|382 1 7 1 6|emb| AJ232428. 1 |SPN232428 
[3821716] 

gi|382 1 7 1 4|emb|A J232427. 1 |SPN232427 
[3821714] 

gi|3821712|emb|AJ232426.1|SPN232426 
[3821712] 

gi|3821710|emb|AJ232425.1|SPN232425 
[3821710] 

gi|3821708|emb|AJ232424.1|SPN232424 
[3821708] 

gi|3821706|emb|AJ232423.1|SPN232423
[3821706]

gi|3821704|emb|AJ232422.1|SPN232422
[3821704]

gi|3821702|emb|AJ232421.1|SPN232421
[3821702]

gi|3821700|emb|AJ232420.1|SPN232420
[3821700]

gi|3821698|emb|AJ232419.1|SPN232419
[3821698]

gi|3821696|emb|AJ232418.1|SPN232418
[3821696]

gi|3821694|emb|AJ232417.1|SPN232417
[3821694]

gi|3821692|emb|AJ232416.1|SPN232416
[3821692]

gi|3821690|emb|AJ232415.1|SPN232415
[3821690]

gi|3821688|emb|AJ232414.1|SPN232414
[3821688]

gi|3821686|emb|AJ232413.1|SPN232413
[3821686]

gi|3821684|emb|AJ232412.1|SPN232412
[3821684]

gi|3821682|emb|AJ232411.1|SPN232411
[3821682]

gi|3821680|emb|AJ232410.1|SPN232410
[3821680]

gi|3821678|emb|AJ232409.1|SPN232409
[3821678]

gi|3821676|emb|AJ232408.1|SPN232408
[3821676] 

gi|3821674|emb|AJ232407.1|SPN232407 
[3821674] 

gi|3821672|emb|AJ232406.1|SPN232406 
[3821672] 
gi|3821670|emb|AJ232405.1|SPN232405
[3821670] 

gi|3821668|emb|AJ232404.1|SPN232404 
[3821668] 

gi|3821666|emb|AJ232403.1|SPN232403
[3821666] 

gi|3821664|emb|AJ232402.1|SPN232402 
[3821664] 

gi|3821662|emb|AJ232401.1|SPN232401
[3821662]

gi|3821660|emb|AJ232399.1|SPN232399
[3821660]

gi|3821658|emb|AJ232398.1|SPN232398
[3821658]

gi|3821656|emb|AJ232397.1|SPN232397
[3821656]

gi|3821654|emb|AJ232396.1|SPN232396
[3821654]

gi|3821652|emb|AJ232395.1|SPN232395
[3821652]

gi|3821650|emb|AJ232394.1|SPN232394
[3821650] 

gi|3821648|emb|AJ232393.1|SPN232393 
[3821648] 

gi|3821646|emb|AJ232392.1|SPN232392 
[3821646] 

gi|3821644|emb|AJ232391.1|SPN232391
[3821644] 

gi|3821642|emb|AJ232390.1|SPN232390 
[3821642] 

gi|3821640|emb|AJ232389.1|SPN232389 
[3821640] 

gi|3821638|emb|AJ232388.1|SPN232388 
[3821638] 

gi|3821636|emb|AJ232387.1|SPN232387 
[3821636] 

gi|3821634|emb|AJ232386.1|SPN232386
[3821634] 

gi|3821632|emb|AJ232385.1|SPN232385 
[3821632] 

gi|3821630|emb|AJ232384.1|SPN232384
[3821630]

gi|3821628|emb|AJ232383.1|SPN232383
[3821628]

gi|3821626|emb|AJ232382.1|SPN232382
[3821626]

gi|3821624|emb|AJ232381.1|SPN232381
[3821624]

gi|3821622|emb|AJ232380.1|SPN232380
[3821622] 

gi|3821620|emb|AJ232379.1|SPN232379 
[3821620] 

gi|3821618|emb|AJ232378.1|SPN232378 
[3821618] 

gi|3821616|emb|AJ232377.1|SPN232377 
[3821616] 

gi|3821614|emb|AJ232376.1|SPN232376 
[3821614] 

gi|3821612|emb|AJ232375.1|SPN232375 
[3821612] 

gi|3821610|emb|AJ232373.1|SPN232373 
[3821610] 

gi|3821608|emb|AJ232372.1|SPN232372 
[3821608] 

gi|3821606|emb|AJ232371.1|SPN232371
[3821606] 

gi|3821604|emb|AJ232370.1|SPN232370 
[3821604] 

gi|3821602|emb|AJ232369.1|SPN232369 
[3821602] 

gi|3821600|emb|AJ232368.1|SPN232368 
[3821600] 

gi|3821598|emb|AJ232367.1|SPN232367 
[3821598] 

gi|3821596|emb|AJ232366.1|SPN232366 
[3821596] 

gi|3821594|emb|AJ232365.1|SPN232365
[3821594]

gi|3820454|emb|AJ007367.1|SPN7367 [3820454]

gi|3821592|emb|AJ232364.1|SPN232364
[3821592] 

gi|3821590|emb|AJ232363.1|SPN232363 
[3821590] 

gi|3821588|emb|AJ232362.1|SPN232362 
[3821588] 

gi|3821586|emb|AJ232361.1|SPN232361 
[3821586] 

gi|3821584|emb|AJ232360.1|SPN232360 
[3821584] 

gi|3821582|emb|AJ232359.1|SPN232359 
[3821582] 

gi|3821580|emb|AJ232358.1|SPN232358 
[3821580] 

gi|3821578|emb|AJ232357.1|SPN232357 
[3821578] 
gi|3821576|emb|AJ232356.1|SPN232356
[3821576]

gi|3821574|emb|AJ232355.1|SPN232355
[3821574]

gi|3821572|emb|AJ232353.1|SPN232353
[3821572]

gi|3821570|emb|AJ232352.1|SPN232352
[3821570]

gi|3821568|emb|AJ232351.1|SPN232351
[3821568]

gi|3821566|emb|AJ232350.1|SPN232350
[3821566] 

gi|3821564|emb|AJ232349.1|SPN232349 
[3821564] 

gi|3821562|emb|AJ232348.1|SPN232348 
[3821562] 

gi|3821560|emb|AJ232347.1|SPN232347
[3821560] 

gi|3821558|emb|AJ232346.1|SPN232346 
[3821558] 

gi|3821556|emb|AJ232345.1|SPN232345 
[3821556] 

gi|3821554|emb|AJ232344.1|SPN232344
[3821554]

gi|3821552|emb|AJ232343.1|SPN232343
[3821552]

gi|3821550|emb|AJ232342.1|SPN232342
[3821550]

gi|3821548|emb|AJ232341.1|SPN232341
[3821548]

gi|3821546|emb|AJ232340.1|SPN232340
[3821546] 

gi|3821544|emb|AJ232339.1|SPN232339 
[3821544] 

gi|3821542|emb|AJ232338.1|SPN232338 
[3821542] 

gi|3821540|emb|AJ232337.1|SPN232337
[3821540]

gi|3821538|emb|AJ232336.1|SPN232336
[3821538] 

gi|3821536|emb|AJ232335.1|SPN232335 
[3821536] 

gi|3821534|emb|AJ232334.1|SPN232334
[3821534]

gi|3821532|emb|AJ232333.1|SPN232333
[3821532] 

gi|3821530|emb|AJ232332.1|SPN232332 
[3821530] 
gi|3821528|emb|AJ232331.1|SPN232331
[3821528]

gi|3821526|emb|AJ232330.1|SPN232330
[3821526]

gi|3821524|emb|AJ232329.1|SPN232329
[3821524]

gi|3821522|emb|AJ232328.1|SPN232328
[3821522] 

gi|3821520|emb|AJ232327.1|SPN232327 
[3821520] 

gi|3821518|emb|AJ232326.1|SPN232326 
[3821518] 

gi|3821516|emb|AJ232325.1|SPN232325 
[3821516] 

gi|3821514|emb|AJ232324.1|SPN232324 
[3821514] 

gi|3821512|emb|AJ232322.1|SPN232322 
[3821512] 

gi|3821510|emb|AJ232321.1|SPN232321
[3821510] 

gi|3821508|emb|AJ232320.1|SPN232320 
[3821508] 

gi|3821506|emb|AJ232319.1|SPN232319
[3821506] 

gi|3821504|emb|AJ232318.1|SPN232318 
[3821504] 

gi|3821502|emb|AJ232317.1|SPN232317 
[3821502] 

gi|3821500|emb|AJ232316.1|SPN232316
[3821500]

gi|3821498|emb|AJ232315.1|SPN232315
[3821498]

gi|3821496|emb|AJ232314.1|SPN232314
[3821496] 

gi|3821494|emb|AJ232313.1|SPN232313 
[3821494] 

gi|3821492|emb|AJ232312.1|SPN232312
[3821492]

gi|3821490|emb|AJ232311.1|SPN232311
[3821490]

gi|3821488|emb|AJ232310.1|SPN232310
[3821488] 

gi|3821486|emb|AJ232309.1|SPN232309 
[3821486] 

gi|3821484|emb|AJ232308.1|SPN232308 
[3821484] 

gi|3821482|emb|AJ232307.1|SPN232307 
[3821482] 



gi|3821480|emb|AJ232306.1|SPN232306
[3821480] 

gi|3821478|emb|AJ232305.1|SPN232305 
[3821478] 

gi|3821476|emb|AJ232304.1|SPN232304 
[3821476] 

gi|3821474|emb|AJ232303.1|SPN232303 
[3821474] 

gi|3821472|emb|AJ232302.1|SPN232302 
[3821472] 

gi|3821470|emb|AJ232301.1|SPN232301
[3821470]

gi|3821468|emb|AJ232300.1|SPN232300
[3821468] 

gi|3821466|emb|AJ232299.1|SPN232299 
[3821466] 

gi|3821464|emb|AJ232298.1|SPN232298 
[3821464] 

gi|3821462|emb|AJ232297.1|SPN232297
[3821462] 

gi|3821460|emb|AJ232295.1|SPN232295 
[3821460] 

gi|3821458|emb|AJ232294.1|SPN232294 
[3821458] 

gi|3821456|emb|AJ232293.1|SPN232293 
[3821456] 

gi|3821454|emb|AJ232292.1|SPN232292 
[3821454] 

gi|3821452|emb|AJ232291.1|SPN232291 
[3821452] 

gi|3821450|emb|AJ232290.1|SPN232290
[3821450] 

gi|3821448|emb|AJ232289.1|SPN232289 
[3821448] 

gi|3821446|emb|AJ232288.1|SPN232288 
[3821446] 

gi|3821444|emb|AJ232287.1|SPN232287 
[3821444] 

gi|3821442|emb|AJ232286.1|SPN232286 
[3821442] 

gi|3821440|emb|AJ232285.1|SPN232285 
[3821440] 

gi|3821438|emb|AJ232284.1|SPN232284 
[3821438] 

gi|3821436|emb|AJ232283.1|SPN232283 
[3821436] 

gi|3821434|emb|AJ232282.1|SPN232282
[3821434]

gi|3821432|emb|AJ232281.1|SPN232281
[3821432] 

gi|3821430|emb|AJ232280.1|SPN232280 
[3821430] 

gi|3821428|emb|AJ232279.1|SPN232279
[3821428] 

gi|3821426|emb|AJ232278.1|SPN232278 
[3821426] 

gi|3821424|emb|AJ232276.1|SPN232276
[3821424]

gi|3821422|emb|AJ232275.1|SPN232275
[3821422]

gi|3821420|emb|AJ232274.1|SPN232274
[3821420]

gi|3821418|emb|AJ232273.1|SPN232273
[3821418] 

gi|3821416|emb|AJ232272.1|SPN232272 
[3821416] 

gi|3821414|emb|AJ232271.1|SPN232271 
[3821414] 

gi|3821412|emb|AJ232270.1|SPN232270 
[3821412] 

gi|3821410|emb|AJ232269.1|SPN232269 
[3821410] 

gi|3821408|emb|AJ232268.1|SPN232268 
[3821408] 

gi|3821406|emb|AJ232267.1|SPN232267 
[3821406] 

gi|3821404|emb|AJ232266.1|SPN232266 
[3821404] 

gi|3821402|emb|AJ232265.1|SPN232265 
[3821402] 

gi|3821400|emb|AJ232264.1|SPN232264
[3821400] 

gi|3821398|emb|AJ232263.1|SPN232263 
[3821398] 

gi|3821396|emb|AJ232262.1|SPN232262 
[3821396] 

gi|3821394|emb|AJ232261.1|SPN232261 
[3821394] 

gi|3821392|emb|AJ232260.1|SPN232260 
[3821392] 

gi|3821390|emb|AJ232259.1|SPN232259
[3821390] 

gi|3821388|emb|AJ232258.1|SPN232258 
[3821388] 

gi|3821386|emb|AJ232257.1|SPN232257
[3821386] 



gi|3821384|emb|AJ232256.1|SPN232256 
[3821384] 

gi|3821382|emb|AJ232255.1|SPN232255 
[3821382] 

gi|3821380|emb|AJ232254.1|SPN232254 
[3821380] 

gi|3821378|emb|AJ232253.1|SPN232253 
[3821378] 

gi|3821376|emb|AJ232252.1|SPN232252
[3821376] 

gi|3821374|emb|AJ232251.1|SPN232251 
[3821374] 

gi|3821372|emb|AJ232250.1|SPN232250
[3821372] 

gi|3821370|emb|AJ232249.1|SPN232249 
[3821370] 

gi|3821367|emb|AJ232248.1|SPN232248 
[3821367] 

gi|3821365|emb|AJ232247.1|SPN232247 
[3821365] 

gi|3821363|emb|AJ232246.1|SPN232246
[3821363] 

gi|3821361|emb|AJ232245.1|SPN232245 
[3821361] 

gi|3821359|emb|AJ232244.1|SPN232244 
[3821359] 

gi|3821357|emb|AJ232243.1|SPN232243 
[3821357] 

gi|3821355|emb|AJ232241.1|SPN232241 
[3821355] 

gi|2921842|gb|AF047385.1|AF047385 [2921842] 

gi|2909863|gb|AF047696.1|AF047696 [2909863] 

gi|4193353|gb|AF055088.1|AF055088 [4193353]

gi|4185242|gb|AH007276.1|SEG_SPTNJUNC
[4185242]

gi|4185241|gb|AF066797.1|SPTNJUNC2
[4185241]

gi|4185240|gb|AF066796.1|SPTNJUNC1
[4185240]

gi|4097979|gb|U72655.1|SPU72655 [4097979]
gi|4063720|gb|U9323.1|STRMTR [4063720]
gi|1657605|gb|U66846.1|SPU66846 [1657605]
gi|1657602|gb|U66845.1|SPU66845 [1657602]
gi|4009485|gb|AF068903.1|AF068903 [4009485]
gi|4009477|gb|AF068902.1|AF068902 [4009477]
gi|4009462|gb|AF068901.1|AF068901 [4009462]
gi|3947767|emb|AJ233896.1|SPN233896
[3947767] 

gi|3947765|emb|AJ233895.1|SPN233895
[3947765]

gi|3947763|emb|AJ233894.1|SPN233894
[3947763]

gi|3947761|emb|AJ233893.1|SPN233893
[3947761]

gi|3947759|emb|AJ233892.1|SPN233892
[3947759]

gi|3947757|emb|AJ233891.1|SPN233891
[3947757]

gi|3947755|emb|AJ233890.1|SPN233890
[3947755]

gi|3947753|emb|AJ233889.1|SPN233889
[3947753]

gi|3947751|emb|AJ233888.1|SPN233888
[3947751]

gi|3947749|emb|AJ233887.1|SPN233887
[3947749]

gi|3947730|emb|AJ233886.1|SPN233886
[3947730]

gi|3758891|emb|Z71552.1|SPADCA [3758891]

gi|3818479|gb|AF057294.1|AF057294 [3818479]

gi|2351767|gb|U89711.1|SPU89711 [2351767] 

gi|3395661|dbj|AB006879.1|AB006879 [3395661]

gi|3395659|dbj|AB006878.1|AB006878 [3395659]

gi|3395657|dbj|AB006877.1|AB006877 [3395657]

gi|3395655|dbj|AB006876.1|AB006876 [3395655]

gi|3395653|dbj|AB006875.1|AB006875 [3395653]

gi|3395651|dbj|AB006874.1|AB006874 [3395651]

gi|3395649|dbj|AB006873.1|AB006873 [3395649]

gi|3395647|dbj|AB006872.1|AB006872 [3395647]

gi|3395645|dbj|AB006871.1|AB006871 [3395645]

gi|3395643|dbj|AB006870.1|AB006870 [3395643]

gi|3395641|dbj|AB006869.1|AB006869 [3395641]

gi|3395639|dbj|AB006868.1|AB006868 [3395639]

gi|2315992|gb|U87092.1|SPU87092 [2315992]

gi|2209338|gb|U93576.1|SPU93576 [2209338]

gi|2109442|gb|AF000658.1|SPDNAARG
[2109442] 

gi|1881538|gb|U09239.1|SPU09239 [1881538] 
gi|1666904|gb|U76218.1|SPU76218 [1666904] 
gi|1613766|gb|U33315.1|SPU33315 [1613766] 



gi|1498294|gb|U41735.1|SPU41735 [1498294]
gi|1213493|gb|U47687.1|SPU47687 [1213493]
gi|1163109|gb|U43526.1|SPU43526 [1163109]
gi|556001|gb|U15171.1|SPU15171 [556001]
gi|455063|gb|U02920.1|SPU02920 [455063]
gi|784896|gb|L36923.1|STRSTRH [784896]
gi|3320386|gb|AF030373.1|AF030373 [3320386]
gi|2804772|gb|AF030374.1|AF030374 [2804772]
gi|2804762|gb|AF030372.1|AF030372 [2804762]
gi|2804756|gb|AF030371.1|AF030371 [2804756]
gi|2804750|gb|AF030370.1|AF030370 [2804750]
gi|2804745|gb|AF030369.1|AF030369 [2804745]
gi|2804739|gb|AF030368.1|AF030368 [2804739]
gi|2804732|gb|AF030367.1|AF030367 [2804732]
gi|2804726|gb|AF030366.1|AF030366 [2804726]
gi|2804720|gb|AF030365.1|AF030365 [2804720]
gi|2804713|gb|AF030364.1|AF030364 [2804713]
gi|2804707|gb|AF030363.1|AF030363 [2804707]
gi|2804701|gb|AF030362.1|AF030362 [2804701]
gi|2804694|gb|AF030361.1|AF030361 [2804694]
gi|2804688|gb|AF030360.1|AF030360 [2804688]
gi|2804682|gb|AF030359.1|AF030359 [2804682]
gi|3550979|dbj|AB010387.1|AB010387 [3550979]
gi|2275100|emb|AJ000336.1|SPR6LDH [2275100]
gi|3551853|gb|AF076029.1|AF076029 [3551853]
gi|3551773|gb|U94770.1|SPU94770 [3551773]
gi|3550617|emb|AJ004869.1|SPAJ4869 [3550617]
gi|3513563|gb|AF055727.1|AF055727 [3513563]
gi|3513561|gb|AF055726.1|AF055726 [3513561]
gi|3513559|gb|AF055725.1|AF055725 [3513559]
gi|3513557|gb|AF055724.1|AF055724 [3513557]
gi|3513555|gb|AF055723.1|AF055723 [3513555]
gi|3513553|gb|AF055722.1|AF055722 [3513553]
gi|3513549|gb|AF055721.1|AF055721 [3513549]
gi|3513545|gb|AF055720.1|AF055720 [3513545]
gi|1914869|emb|Z82001.1|SPZ82001 [1914869]
gi|2911421|gb|AF046238.1|AF046238 [2911421]
gi|2911419|gb|AF046237.1|AF046237 [2911419]
gi|2911417|gb|AF046236.1|AF046236 [2911417]
gi|2911415|gb|AF046235.1|AF046235 [2911415]
gi|2911413|gb|AF046234.1|AF046234 [2911413]
gi|2911411|gb|AF046233.1|AF046233 [2911411]
gi|2911409|gb|AF046232.1|AF046232 [2911409]
gi|2911407|gb|AF046231.1|AF046231 [2911407]
gi|2911405|gb|AF046230.1|AF046230 [2911405]
gi|3258601|gb|U40786.1|SPU40786 [3258601]
gi|3211756|gb|AF052209.1|AF052209 [3211756]
gi|3211752|gb|AF052208.1|AF052208 [3211752]
gi|3211747|gb|AF052207.1|AF052207 [3211747]
gi|3220194|gb|AF053121.1|AF053121 [3220194]
gi|2766052|emb|Z99863.1|SPZ99863 [2766052] 
gi|2766050|emb|Z99862.1|SPZ99862 [2766050] 
gi|2766048|emb|Z99861.1|SPZ99861 [2766048] 
gi|2766046|emb|Z99860.1|SPZ99860 [2766046]
gi|2766044|emb|Z99859.1|SPZ99859 [2766044]
gi|2766042|emb|Z99858.1|SPZ99858 [2766042] 
gi|2766040|emb|Z99857.1|SPZ99857 [2766040] 
gi|2766038|emb|Z99856.1|SPZ99856 [2766038] 
gi|2766036|emb|Z99855.1|SPZ99855 [2766036] 
gi|2766034|emb|Z99854.1|SPZ99854 [2766034] 
gi|2766032|emb|Z99853.1|SPZ99853 [2766032] 
gi|2766030|emb|Z99852.1|SPZ99852 [2766030] 
gi|2766028|emb|Z99851.1|SPZ99851 [2766028]
gi|2766026|emb|Z99850.1|SPZ99850 [2766026]
gi|2766024|emb|Z99849.1|SPZ99849 [2766024]
gi|2766022|emb|Z99848.1|SPZ99848 [2766022]
gi|2766020|emb|Z99847.1|SPZ99847 [2766020]
gi|2766018|emb|Z99846.1|SPZ99846 [2766018]
gi|2766016|emb|Z99845.1|SPZ99845 [2766016]
gi|2766014|emb|Z99844.1|SPZ99844 [2766014]
gi|2766012|emb|Z99843.1|SPZ99843 [2766012]
gi|2766010|emb|Z99842.1|SPZ99842 [2766010]
gi|2766008|emb|Z99841.1|SPZ99841 [2766008]
gi|2766006|emb|Z99840.1|SPZ99840 [2766006]
gi|2766004|emb|Z99839.1|SPZ99839 [2766004]
gi|2766002|emb|Z99838.1|SPZ99838 [2766002]
gi|2766000|emb|Z99837.1|SPZ99837 [2766000]
gi|2765998|emb|Z99828.1|SPZ99828 [2765998]
gi|2765996|emb|Z99827.1|SPZ99827 [2765996]
gi|2765994|emb|Z99826.1|SPZ99826 [2765994]



gi|2765992|emb|Z99825.1|SPZ99825 [2765992] 

gi|2765990|emb|Z99824.1|SPZ99824 [2765990]

gi|2765988|emb|Z99823.1|SPZ99823 [2765988]

gi|2765986|emb|Z99822.1|SPZ99822 [2765986]

gi|2765984|emb|Z99821.1|SPZ99821 [2765984]

gi|2765982|emb|Z99820.1|SPZ99820 [2765982]

gi|2765980|emb|Z99819.1|SPZ99819 [2765980]

gi|2765978|emb|Z99818.1|SPZ99818 [2765978]

gi|2765976|emb|Z99817.1|SPZ99817 [2765976]

gi|2765974|emb|Z99816.1|SPZ99816 [2765974]

gi|2765972|emb|Z99815.1|SPZ99815 [2765972] 

gi|2765970|emb|Z99814.1|SPZ99814 [2765970] 

gi|2765968|emb|Z99813.1|SPZ99813 [2765968]

gi|2765966|emb|Z99812.1|SPZ99812 [2765966]

gi|2765964|emb|Z99811.1|SPZ99811 [2765964]

gi|2765962|emb|Z99810.1|SPZ99810 [2765962] 

gi|2765960|emb|Z99809.1|SPZ99809 [2765960] 

gi|2765958|emb|Z99808.1|SPZ99808 [2765958] 

gi|2765956|emb|Z99807.1|SPZ99807 [2765956] 

gi|2765954|emb|Z99806.1|SPZ99806 [2765954]

gi|2765952|emb|Z99805.1|SPZ99805 [2765952]

gi|2765950|emb|Z99804.1|SPZ99804 [2765950]

gi|2765948|emb|Z99803.1|SPZ99803 [2765948]

gi|2894104|emb|X77249.1|SPR6CIARH [2894104] 

gi|3153897|gb|AF067128.1|AF067128 [3153897]

gi|3152712|gb|AF065153.1|AF065153 [3152712]

gi|3152710|gb|AF065152.1|AF065152 [3152710]

gi|3152708|gb|AF065151.1|AF065151 [3152708]

gi|3116426|gb|U84387.1|SPU84387 [3116426]

gi|2385403|emb|AJ001247.1|SP7465RR3 
[2385403] 

gi|2342540|emb|AJ001250.1|SP7978RR5
[2342540]

gi|2342539|emb|AJ001251.1|SP7978RR3
[2342539] 

gi|2342538|emb|AJ001248.1|SP7466RR5 
[2342538] 

gi|2342537|emb|AJ001249.1|SP7466RR3
[2342537] 

gi|3065896|gb|AF058920.1|AF058920 [3065896]
gi|2982647|emb|AJ002294.1|SPAJ2294 [2982647]
gi|2982645|emb|AJ002293.1|SPAJ2293 [2982645]
gi|2982643|emb|AJ002292.1|SPAJ2292 [2982643]
gi|2982641|emb|AJ002291.1|SPAJ2291 [2982641]
gi|1620466|emb|X99400.1|SPDACAO [1620466]
gi|2196665|emb|Z84381.1|HSZ84381 [2196665]
gi|2196663|emb|Z84380.1|HSZ84380 [2196663]
gi|2196661|emb|Z84379.1|HSZ84379 [2196661]
gi|2196659|emb|Z84378.1|HSZ84378 [2196659]
gi|625175|gb|L36131.1|STREXP10A [625175]
gi|3004945|gb|AF036624.1|AF036624 [3004945]
gi|3004943|gb|AF036623.1|AF036623 [3004943]
gi|3004941|gb|AF036622.1|AF036622 [3004941]
gi|3004939|gb|AF036621.1|AF036621 [3004939]
gi|3004937|gb|AF036620.1|AF036620 [3004937]
gi|3004935|gb|AF036619.1|AF036619 [3004935]
gi|2370572|emb|Z86112.1|SPZ86112 [2370572]
gi|2765946|emb|Z99802.1|SPZ99802 [2765946]
gi|2398824|emb|Z34303.1|SPCINREC [2398824]
gi|2894512|emb|AJ223491.1|SPPPR3 [2894512]
gi|2198539|emb|X85787.1|SPCPS14E [2198539]
gi|2766156|emb|Z99915.1|SPZ99915 [2766156]
gi|2766154|emb|Z99914.1|SPZ99914 [2766154]
gi|2766152|emb|Z99913.1|SPZ99913 [2766152]
gi|2766150|emb|Z99912.1|SPZ99912 [2766150]
gi|2766148|emb|Z99911.1|SPZ99911 [2766148]
gi|2766146|emb|Z99910.1|SPZ99910 [2766146]
gi|2766144|emb|Z99909.1|SPZ99909 [2766144]
gi|2766142|emb|Z99908.1|SPZ99908 [2766142]
gi|2766140|emb|Z99907.1|SPZ99907 [2766140]
gi|2766138|emb|Z99906.1|SPZ99906 [2766138]
gi|2766136|emb|Z99905.1|SPZ99905 [2766136]
gi|2766134|emb|Z99904.1|SPZ99904 [2766134]
gi|2766132|emb|Z99903.1|SPZ99903 [2766132]
gi|2766130|emb|Z99902.1|SPZ99902 [2766130]
gi|2766128|emb|Z99901.1|SPZ99901 [2766128]
gi|2766126|emb|Z99900.1|SPZ99900 [2766126]
gi|2766124|emb|Z99899.1|SPZ99899 [2766124]
gi|2766122|emb|Z99898.1|SPZ99898 [2766122]
gi|2766120|emb|Z99897.1|SPZ99897 [2766120]
gi|2766118|emb|Z99896.1|SPZ99896 [2766118]



gi|2766116|emb|Z99895.1|SPZ99895 [2766116]
gi|2766114|emb|Z99894.1|SPZ99894 [2766114]
gi|2766112|emb|Z99893.1|SPZ99893 [2766112]
gi|2766110|emb|Z99892.1|SPZ99892 [2766110]
gi|2766108|emb|Z99891.1|SPZ99891 [2766108]
gi|2766106|emb|Z99890.1|SPZ99890 [2766106]
gi|2766104|emb|Z99889.1|SPZ99889 [2766104]
gi|2766102|emb|Z99888.1|SPZ99888 [2766102]
gi|2766100|emb|Z99887.1|SPZ99887 [2766100]
gi|2766098|emb|Z99886.1|SPZ99886 [2766098]
gi|2766096|emb|Z99885.1|SPZ99885 [2766096]
gi|2766094|emb|Z99884.1|SPZ99884 [2766094]
gi|2766092|emb|Z99883.1|SPZ99883 [2766092]
gi|2766090|emb|Z99882.1|SPZ99882 [2766090]
gi|2766088|emb|Z99881.1|SPZ99881 [2766088]
gi|2766086|emb|Z99880.1|SPZ99880 [2766086]
gi|2766084|emb|Z99879.1|SPZ99879 [2766084]
gi|2766082|emb|Z99878.1|SPZ99878 [2766082]
gi|2766080|emb|Z99877.1|SPZ99877 [2766080]
gi|2766078|emb|Z99876.1|SPZ99876 [2766078]
gi|2766076|emb|Z99875.1|SPZ99875 [2766076]
gi|2766074|emb|Z99874.1|SPZ99874 [2766074]
gi|2766072|emb|Z99873.1|SPZ99873 [2766072]
gi|2766070|emb|Z99872.1|SPZ99872 [2766070]
gi|2766068|emb|Z99871.1|SPZ99871 [2766068]
gi|2766066|emb|Z99870.1|SPZ99870 [2766066]
gi|2766064|emb|Z99869.1|SPZ99869 [2766064]
gi|2766062|emb|Z99868.1|SPZ99868 [2766062]
gi|2766060|emb|Z99867.1|SPZ99867 [2766060]
gi|2766058|emb|Z99866.1|SPZ99866 [2766058]
gi|2766056|emb|Z99865.1|SPZ99865 [2766056]
gi|2766054|emb|Z99864.1|SPZ99864 [2766054]
gi|2765906|emb|Z99206.1|SPZ99206 [2765906]
gi|2765904|emb|Z99205.1|SPZ99205 [2765904]
gi|2765902|emb|Z99204.1|SPZ99204 [2765902]
gi|2765900|emb|Z99203.1|SPZ99203 [2765900]
gi|2765898|emb|Z99202.1|SPZ99202 [2765898]
gi|2765896|emb|Z99201.1|SPZ99201 [2765896]
gi|2765894|emb|Z99200.1|SPZ99200 [2765894]
gi|2708631|gb|AF036951.1|AF036951 [2708631]
gi|886956|emb|Z49097.1|SPCS1112X [886956]

gi|2656093|gb|L21856.1|STRMALR [2656093]

gi|2576332|emb|AJ002055.1|SPSPSA47 [2576332] 

gi|2576330|emb|AJ002054.1|SPSPSA2 [2576330] 

gi|2511704|emb|Y10818.1|SPY10818 [2511704]

gi|1944619|emb|Z83335.1|SPZ83335 [1944619]

gi|2425108|gb|AF019904.1|AF019904 [2425108]

gi|2385404|emb|AJ001246.1|SP7465RR5
[2385404]

gi|438213|emb|Z16082.1|PNALIB [438213]

gi|2149613|gb|U90721.1|SPU90721 [2149613]

gi|49391|emb|Z2184U|SPPBP2BB [49391] 

gi|2209207|gb|AF004325.1|AF004325 [2209207]

gi|2293061|emb|Z95914.1|SPZ95914 [2293061]

gi|2276393|gb|U16156.1|SPU16156 [2276393] 

gi|2183314|gb|AF003930.1|AF003930 [2183314] 

gi|2182093|emb|X95717.1|SPPARECGN 
[2182093] 

gi|984230|emb|Z49095.1|SPCS1111A [984230]

gi|886954|emb|Z49096.1|SPCS1092X [886954]

gi|1181613|dbj|D82873.1|STRPBP2BE [1181613] 

gi|1181612|dbj|D82871.1|STRPBP2BC2
[1181612]

gi|1181611|dbj|D82870.1|STRPBP2BB2 [1181611]

gi|1181579|dbj|D82869.1|STRPBP2BA1 [1181579]

gi|1181192|dbj|D82872.1|STRPBP2BD [1181192]

gi|575595|dbj|D42075.1|STRPBP2B2 [575595]

gi|1339971|dbj|D42074.1|STRPBP2B1 [1339971]

gi|2108329|emb|Y11463.1|SPDNAGCPO 
[2108329] 

gi|1944115|dbj|AB002522.1|AB002522 [1944115]
gi|1666669|emb|Z77727.1|SPIS1381C [1666669]
gi|1666668|emb|Z77726.1|SPIS1381B [1666668]
gi|1666667|emb|Z77725.1|SPIS1381A [1666667]
gi|1914873|emb|Z82002.1|SPZ82002 [1914873]
gi|1431584|emb|Z74778.1|SPDHFR [1431584]
gi|47452|emb|Z15120.1|SPSTRG [47452]
gi|581717|emb|Z12159.1|SPCP131G [581717]
gi|47342|emb|X17337.1|SPAMILOC [47342]
gi|1800300|gb|U83667.1|SPU83667 [1800300]
gi|1532066|emb|Y07780.1|SPTET0GEN [1532066]



gi|1161269|gb|L39074.1|STRSPXB [1161269]

gi|1460093|emb|X94909.1|SPIGA1PRT [1460093]

gi|1750263|gb|U72720.1|SPU72720 [1750263] 

gi|298649|gb|S56948.1|S56948 [298649] 

gi|254537|gb|S43511.1|S43511 [254537]

gi|245227|gb|S81051.1|S81051 [245227]

gi|245226|gb|S81045.1|S81045 [245226]

gi|245225|gb|S81043.1|S81043 [245225]

gi|1150618|emb|Z49988.1|SPMMSAGEN
[1150618]

gi|47456|emb|X01138.1|SPTN917A [47456]

gi|1658316|emb|Z47210.1|SPDEXCAP [1658316] 

gi|1550802|emb|X95385.1|SPCOMCGEN 
[1550802] 

gi|47457|emb|X01137.1|SPTN917B [47457]

gi|975714|emb|X90941.1|SPTRJ5251 [975714]

gi|975713|emb|X90940.1|SPTLJ5251 [975713]

gi|975709|emb|X90939.1|SPDNATETM [975709]

gi|1524346|emb|Z79691.1|SOORFS [1524346]

gi|1553054|emb|X98364.1|SPPBPHU9 [1553054] 

gi|1553052|emb|X98367.1|SPPBPHU13 [1553052]

gi|1553050|emb|X98366.1|SPPBPHU12 [1553050]

gi|1553048|emb|X98365.1|SPPBPHU11 [1553048]

gi|1575029|gb|U53509.1|SPU53509 [1575029]

gi|1542968|gb|U49088.1|SPU49088 [1542968]

gi|1542966|gb|U49087.1|SPU49087 [1542966]

gi|1536961|emb|Y07845.1|SPGYRA [1536961]

gi|47391|emb|X16367.1|SPPBPX [47391]

gi|1490398|emb|Z67739.1|SPPARCETP [1490398]

gi|1490395|emb|Z67740.1|SPGYRBORF 
[1490395] 

gi|1431589|emb|Z74777.1|SPTMRDHFR
[1431589]

gi|408145|emb|Z21702.1|SPUNGMUTX [408145]
gi|47461|emb|X61025.1|SPXISINT [47461]
gi|47459|emb|X55651.1|SPUNGG [47459]
gi|47454|emb|X52632.1|SPT1545E [47454]
gi|47421|emb|Z17307.1|SPRECA [47421]
gi|47419|emb|X67873.1|SPPONA8 [47419]
gi|47417|emb|X67872.1|SPPONA7 [47417]
gi|47415|emb|X67871.1|SPPONA6 [47415]
gi|47413|emb|X67870.1|SPPONA5 [47413]
gi|47411|emb|X67869.1|SPPONA4 [47411]
gi|47409|emb|X67867.1|SPPONA2 [47409]
gi|47407|emb|X67866.1|SPPONA1 [47407]
gi|47405|emb|X67868.1|SPPNA3 [47405]
gi|47403|emb|X52474.1|SPPLY [47403]
gi|984232|emb|X16022.1|SPPENA [984232]
gi|517190|emb|X78215.1|SPPBPXG [517190]
gi|295840|emb|Z22230.1|SPPBP2BBA [295840]
gi|288981|emb|Z22185.1|SPPBP2BAC [288981]
gi|288979|emb|Z22184.1|SPPBP2BAB [288979]
gi|288466|emb|Z21981.1|SPPBP2BAA [288466]
gi|49390|emb|Z21813.1|SPPBP2XD [49390]
gi|49389|emb|Z21812.1|SPPBP2XC [49389]
gi|49387|emb|Z21811.1|SPPBP2BJ [49387]
gi|49385|emb|Z21810.1|SPPBP2BI [49385]
gi|49382|emb|Z21808.1|SPPBP2BH [49382]
gi|49380|emb|Z21807.1|SPPBP2BG [49380]
gi|49379|emb|Z21806.1|SPPBP2BF [49379]
gi|49377|emb|Z21805.1|SPPBP2BE [49377]
gi|49376|emb|Z21804.1|SPPBP2XB [49376]
gi|49375|emb|Z21803.1|SPPBP2XA [49375]
gi|49374|emb|Z21802.1|SPPBP2BD [49374]
gi|49372|emb|Z21801.1|SPPBP2BC [49372]
gi|49369|emb|Z21799.1|SPPBP2BA [49369]
gi|47399|emb|X13137.1|SPPENASE [47399]
gi|47397|emb|X13136.1|SPPENARE [47397]
gi|1052802|emb|X83917.1|SPGYRBG [1052802]
gi|587550|emb|X72967.1|SPNANA [587550]
gi|49384|emb|Z21809.1|SPPBP1AB [49384]
gi|49371|emb|Z21800.1|SPPBP1AA [49371]
gi|984228|emb|Z49094.1|SPCS1091A [984228]
gi|47372|emb|X54225.1|SPENDA [47372]
gi|806590|emb|Z49246.1|SP667SOD [806590]
gi|407172|emb|Z26851.1|SPATPAS2 [407172]
gi|407166|emb|Z26850.1|SPATPAS1 [407166]
gi|47353|emb|X63602.1|SPBOX [47353]
gi|47348|emb|X05577.1|SPAPHA3 [47348]
gi|47337|emb|X65132.1|SP824PBPX [47337]
gi|47335|emb|X65134.1|SP669PBPX [47335]



gi|47331|emb|X65133.1|SP577PBPX [47331]
gi|559527|emb|X65136.1|SP110PBPX [559527]
gi|311415|emb|Z22807.1|SP16SRNAA [311415]
gi|47329|emb|X65135.1|SP531PBPX [47329]
gi|47307|emb|X65131.1|SP290PBPX [47307]
gi|47295|emb|X58312.1|SP16SRNA [47295]
gi|854614|emb|Z49109.1|SPGADAGN [854614]
gi|556428|gb|L36660.1|STRORF1 [556428]
gi|511062|emb|Z35135.1|SPALIAG [511062]
gi|1208737|gb|U47625.1|SPU47625 [1208737]
gi|530062|gb|U12567.1|SPU12567 [530062]
gi|153656|gb|M29686.1|STRHEXB [153656] 
gi|153654|gb|M18729.1|STRHEXA [153654] 
gi|153608|gb|M14339.1|STRDPN2A [153608] 
gi|153605|gb|M14340.1|STRDPN1A [153605]
gi|643543|gb|U20084.1|SPU20084 [643543]
gi|643541|gb|U20083.1|SPU20083 [643541]
gi|643539|gb|U20082.1|SPU20082 [643539]
gi|643537|gb|U20081.1|SPU20081 [643537]
gi|643535|gb|U20080.1|SPU20080 [643535]
gi|643533|gb|U20079.1|SPU20079 [643533]
gi|643531|gb|U20078.1|SPU20078 [643531]
gi|643529|gb|U20077.1|SPU20077 [643529]
gi|643527|gb|U20076.1|SPU20076 [643527]
gi|643525|gb|U20075.1|SPU20075 [643525]
gi|643523|gb|U20074.1|SPU20074 [643523]
gi|643521|gb|U20073.1|SPU20073 [643521]
gi|643519|gb|U20072.1|SPU20072 [643519]
gi|643517|gb|U20071.1|SPU20071 [643517]
gi|643515|gb|U20070.1|SPU20070 [643515]
gi|643513|gb|U20069.1|SPU20069 [643513]
gi|643511|gb|U20068.1|SPU20068 [643511]
gi|643509|gb|U20067.1|SPU20067 [643509]
gi|1017802|gb|U37560.1|SPU37560 [1017802]
gi|663277|gb|M36180.1|STRCOMAA [663277]
gi|437704|gb|L20670.1|STRHYALURO [437704]
gi|153849|gb|L07751.1|STRTN5252R [153849]
gi|153855|gb|M25519.1|STRVA1 [153855]
gi|153853|gb|M80215.1|STRUVS402A [153853]
gi|153848|gb|L07750.1|STRTN5252L [153848]
gi|153840|gb|M74122.1|STRSURPROA [153840]
gi|153796|gb|M60763.1|STRRRNAA [153796]
gi|153791|gb|M31296.1|STRRECP [153791]
gi|516639|gb|L20556.1|STRPLPA [516639]
gi|153783|gb|M28679.1|STRPROMB [153783]
gi|153782|gb|M28678.1|STRPROMA [153782]
gi|153766|gb|M90527.1|STRPONA [153766]
gi|153764|gb|J04479.1|STRPOLA [153764]
gi|153752|gb|M25515.1|STRNG4369 [153752]
gi|153722|gb|L08611.1|STRMLTODX [153722]
gi|153702|gb|J01796.1|STRMALMXP [153702]
gi|153701|gb|J01795.1|STRMALMX [153701]
gi|153693|gb|M13812.1|STRLYTPN [153693]
gi|153691|gb|M17717.1|STRLYS [153691]
gi|153667|gb|M25525.1|STRKAG73 [153667]
gi|398102|gb|L20564.1|STREXP9B [398102]
gi|398100|gb|L20563.1|STREXP9A [398100]
gi|398098|gb|L20562.1|STREXP8A [398098]
gi|398096|gb|L20561.1|STREXP7A [398096]
gi|398094|gb|L20560.1|STREXP6A [398094]
gi|398092|gb|L20559.1|STREXP5A [398092]
gi|398090|gb|L20558.1|STREXP4A [398090]
gi|153626|gb|J04234.1|STREXOA [153626]
gi|153612|gb|M11226.1|STRDPNM [153612]
gi|153603|gb|M25521.1|STRDN87669 [153603]
gi|153601|gb|M25526.1|STRDN87577 [153601]
gi|153599|gb|M25522.1|STRDN179 [153599]
gi|153594|gb|M37688.1|STRDACA [153594]
gi|153582|gb|L07752.1|STRATTB [153582]
gi|466514|gb|L31413.1|STR1RRA [466514]
gi|153551|gb|M25520.1|STR8249 [153551]
gi|153549|gb|M25524.1|STR5313972 [153549]
gi|153547|gb|M25517.1|STR29044 [153547]
gi|153545|gb|M25523.1|STR181071 [153545]
gi|153541|gb|M25518.1|STR121 [153541]
gi|153539|gb|M25516.1|STR110K70 [153539]
gi|506632|gb|U04047.1|SPU04047 [506632]
gi|393267|gb|L19055.1|STRPAPA [393267]
gi|442066|gb|S62272.1|S62272 [442066]
gi|295191|gb|L15190.1|STRPURISYN [295191]
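
The entries in the listing above appear to follow the legacy NCBI identifier layout, gi|GI number|database tag|accession.version|locus name, followed by the GI number repeated in square brackets. As a minimal sketch, assuming that layout and assuming only the database tags seen in this listing (gb, emb, dbj), an entry could be parsed and cross-checked as follows. The names SequenceId and parse_entry are hypothetical and are not part of the original disclosure.

```python
import re
from typing import NamedTuple, Optional


class SequenceId(NamedTuple):
    """One parsed listing entry (hypothetical helper type)."""
    gi: int          # GI number
    database: str    # source database tag: gb, emb, or dbj
    accession: str   # accession without the version suffix
    version: int     # accession version
    locus: str       # locus name


# Assumed layout: gi|<GI>|<db>|<accession>.<version>|<locus> [<GI>]
_ENTRY = re.compile(
    r"gi\|(\d+)\|(gb|emb|dbj)\|([A-Z]+\d+)\.(\d+)\|(\S+)(?: \[(\d+)\])?"
)


def parse_entry(text: str) -> Optional[SequenceId]:
    """Parse one entry; return None if it does not fit the assumed layout.

    Whitespace runs (including a line break before the bracketed GI
    number) are collapsed to single spaces before matching.
    """
    normalized = " ".join(text.split())
    match = _ENTRY.fullmatch(normalized)
    if match is None:
        return None
    gi, database, accession, version, locus, bracketed = match.groups()
    # The bracketed number, when present, must repeat the GI number.
    if bracketed is not None and bracketed != gi:
        return None
    return SequenceId(int(gi), database, accession, int(version), locus)
```

Because the bracketed number duplicates the GI number, an entry whose two copies disagree is rejected rather than guessed at, which makes the check useful for spotting transcription damage in a listing like this one.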
CLAIMS

What is claimed is: 

1. A method for identifying a bacteriophage coding region encoding a
product active on an essential bacterial target, comprising identifying a nucleic acid
sequence encoding a gene product which provides a bacteria-inhibiting function when
said bacteriophage infects a host bacterium,

wherein said bacteriophage is uncharacterized and said host bacterium
is a pathogenic bacterium.

2. The method of claim 1, further comprising expressing a recombinant
bacteriophage ORF in cells of a bacterial strain, wherein inhibition of said cells
following expression of said ORF is indicative that said product is active on an
essential bacterial target.

3. The method of claim 2, wherein inhibition of said bacterium following 
expression of said ORF is determined by comparison with the growth or viability of 
said bacterium following expression of an inactivated mutant form of said ORF or in
the absence of expression of said ORF, and wherein inhibition of said bacterium
following expression of said ORF is indicative that said product is active on an 
essential bacterial target. 

4. The method of claim 2, wherein expression of said ORF is inducible. 

5. The method of claim 1, further comprising sequencing at least a
portion of a bacteriophage genome. 

6. The method of claim 1, wherein at least a portion of the nucleotide
sequence of a bacteriophage genome is known, said method further comprising
identifying at least one ORF in said portion by computer analysis of said sequence.

7. The method of claim 6, further comprising analyzing the sequence of
said at least one ORF or of a polypeptide encoded by said ORF to identify
homologous genes or gene products of known biochemical function, thereby
indicating the biochemical function of said polypeptide.

8. The method of claim 7, wherein said homologous gene or gene product
is a bacterial gene important for cell viability.

9. The method of claim 7, wherein said homologous gene or gene product
is a gene or gene product known to have a bacteria-inhibiting function.

10. The method of claim 6, further comprising analyzing the sequence of 
said at least one ORF or of a polypeptide encoded by said ORF to identify structural 
motifs in said polypeptide, thereby indicating the cellular function of said polypeptide. 

11. The method of claim 1, wherein a host bacterium for said 
bacteriophage is selected from the species group consisting of bacteria listed in Table 
1. 



12. The method of claim 1, wherein said bacteriophage is selected from the 
group consisting of uncharacterized bacteriophage listed in Table 1. 



13. The method of claim 2, wherein a plurality of bacteriophage ORFs are 
expressed in at least one bacterium. 

14. The method of claim 13, wherein each of said plurality of 
bacteriophage ORFs is expressed in a different bacterium. 



15. The method of claim 14, wherein said plurality of bacteriophage ORFs 
comprises at least 10% of the ORFs in the genome of said bacteriophage. 

16. The method of claim 1, wherein said pathogenic bacterium is an animal 
pathogen. 

17. The method of claim 16, wherein said pathogenic bacterium is a human 
pathogen. 

18. The method of claim 1, wherein said pathogenic bacterium is a plant 
pathogen. 

19. The method of claim 1, further comprising confirming the inhibitor 
function of said ORF. 

20. The method of claim 19, wherein said confirming comprises 
expressing a loss-of-function mutant form of said ORF in said host bacterium. 

21. The method of claim 1, wherein said identifying a nucleic acid 
sequence encoding a gene product active on an essential bacterial target comprises 
identifying a nucleic acid sequence encoding a homolog of a bacteriophage 
polypeptide known to be active on an essential bacterial target. 

22. The method of claim 1, wherein said identifying a bacteriophage 
coding region comprises identifying a first coding region from a bacteriophage having 
a non-pathogenic host bacterial strain related to said pathogenic bacterium, said first 
coding region encoding a product active on an essential bacterial target; and 
identifying a homolog of said first coding region, wherein said 
homolog is a probable said bacteriophage coding region encoding a product active on 
an essential bacterial target. 

23. The method of claim 2, wherein a plurality of bacteriophage ORFs 
from a plurality of different bacteriophage are expressed in at least one bacterium. 

24. The method of claim 23, wherein each of said plurality of 
bacteriophage ORFs are expressed in different bacteria. 



25. A method for identifying a target for antibacterial agents, comprising 
determining the bacterial target of an uncharacterized bacteriophage inhibitor protein. 

26. The method of claim 25, wherein said determining comprises 
identifying at least one bacterial protein which binds to said bacteriophage inhibitor 
protein or a fragment thereof. 

27. The method of claim 26, wherein said binding is determined using 
affinity chromatography on a solid matrix. 

28. The method of claim 25, wherein said determining comprises 
identifying at least one protein-protein interaction using a genetic screen. 



29. The method of claim 28, wherein said genetic screen is a yeast two- 
hybrid screen. 

30. The method of claim 25, wherein said determining comprises a 
co-immunoprecipitation assay or a protein-protein crosslinking assay. 

31. The method of claim 25, wherein said determining comprises 
identifying a mutated bacterial coding sequence which protects a bacterium from said 
bacteriophage inhibitor. 

32. The method of claim 25, wherein said determining comprises 
identifying a bacterial coding sequence which protects a bacterium against said 
bacteriophage inhibitor when expressed at high levels in said bacterium. 

33. The method of claim 25, wherein said determining further comprises 
identifying a bacterial nucleic acid sequence encoding a polypeptide target of said 
bacteriophage inhibitor protein. 

34. The method of claim 33, wherein said nucleic acid sequence is 
identified by determining at least a portion of the amino acid sequence of a bacterial 
protein target, and identifying a bacterial nucleic acid sequence which encodes said 
protein target. 

35. The method of claim 25, wherein said bacterial target is naturally 
produced by a bacterial species selected from the group consisting of species of the 
genera listed in Table 1. 

36. The method of claim 25, wherein said bacterial target is naturally 
produced by a bacterial strain selected from the group consisting of species listed in 
Table 1. 

37. The method of claim 25, wherein said inhibitor protein is naturally 
produced by a bacteriophage selected from the group consisting of uncharacterized 
bacteriophage listed in Table 1. 

38. The method of claim 25, further comprising identifying a 
bacteriophage ORF which encodes a product having a bacteria-inhibiting function. 

39. The method of claim 38, wherein said identifying a phage ORF 
comprises expressing at least one bacteriophage ORF in a bacterium, wherein 
inhibition of said bacterium following said expression is indicative that said ORF 
encodes a bacteria-inhibiting function. 

40. The method of claim 39, wherein a plurality of bacteriophage ORFs are 
expressed in at least one bacterium. 

41. The method of claim 40, wherein each of said plurality of 
bacteriophage ORFs is expressed in a different bacterium. 

42. The method of claim 41, wherein said plurality of bacteriophage ORFs 
comprises at least 10% of the ORFs in the genome of said bacteriophage. 

43. The method of claim 25, wherein said determining the bacterial target 
of a bacteriophage inhibitor protein is performed for a plurality of different 
bacteriophage of the same host bacterium. 

44. The method of claim 25, wherein said bacterial target originates from 
an animal pathogen. 

45. The method of claim 44, wherein said bacterial target is a gene 
homologous to a gene from an animal pathogen. 

46. The method of claim 44, wherein said pathogen is a human pathogen. 

47. The method of claim 25, wherein said bacterial target originates from a 
plant pathogen. 

48. The method of claim 25, wherein said bacterial target is a gene 
homologous to a gene from a plant pathogen. 

49. The method of claim 25, further comprising determining the cellular or 
biochemical function or both of said inhibitor protein. 

50. The method of claim 25, wherein said identifying the bacterial target 
comprises identifying a phage-specific site of action. 

51. An isolated, purified, or enriched nucleic acid sequence at least 15 
nucleotides in length, wherein said sequence corresponds to at least a portion of a 
bacteriophage sequence, and wherein said bacteriophage is selected from the group 
consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, 
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1. 

52. The nucleic acid sequence of claim 51, wherein said sequence 
comprises at least 50 nucleotides. 

53. The nucleic acid sequence of claim 51, wherein said nucleic acid 
sequence corresponds to at least a portion of a nucleic acid sequence which encodes a 
product which provides a bacteria-inhibiting function. 

54. The nucleic acid sequence of claim 53, wherein said nucleic acid 
sequence encodes a polypeptide which provides a bacteria-inhibiting function. 

55. The nucleic acid sequence of claim 54, wherein said nucleic acid 
sequence is transcriptionally linked with regulatory sequences enabling induction of 
expression of said sequence. 

56. An isolated, purified, or enriched polypeptide comprising at least a 
portion of a protein providing a bacteria-inhibiting function, wherein said polypeptide 
is normally encoded by a bacteriophage selected from the group consisting of 
Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus 
bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1. 

57. The polypeptide of claim 56, wherein said polypeptide provides said 
bacteria-inhibiting function. 

58. The polypeptide of claim 56, wherein said polypeptide comprises a 
portion at least 10 amino acid residues in length of a said polypeptide normally 
encoded by said bacteriophage. 

59. A recombinant vector comprising a bacteriophage ORF corresponding 
to an ORF from a bacteriophage having a pathogenic bacterial host, wherein said 
bacterial host is selected from the group consisting of uncharacterized bacteria of 
Table 1. 

60. The vector of claim 59, wherein said vector is an expression vector. 

61. The vector of claim 59, wherein said bacteriophage is selected from the 
group consisting of uncharacterized bacteriophage of Table 1. 

62. The vector of claim 61, wherein said bacteriophage is selected from the 
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, 
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1. 

63. The vector of claim 60, wherein expression of said ORF is inducible. 



64. A recombinant cell comprising a vector, wherein said vector comprises 
an ORF from a bacteriophage having a pathogenic bacterial host, wherein said 
bacterial host is selected from the group consisting of bacterial species of Table 1. 

65. The recombinant cell of claim 64, wherein said bacteriophage is 
selected from the group consisting of uncharacterized phage of Table 1. 

66. The cell of claim 65, wherein said bacteriophage is selected from the 
group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, 
Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1. 

67. The cell of claim 64, wherein said vector is an expression vector and 
expression of said ORF is inducible. 





68. A method for identifying an antibacterial agent, comprising identifying 
an active portion of a product of a bacteria-inhibiting ORF of a bacteriophage. 

69. The method of claim 68, further comprising constructing a synthetic 
peptidomimetic molecule, wherein the structure of said molecule corresponds to the 
structure of said active portion. 

70. A method for identifying a compound active on a target of a 
bacteriophage inhibitor protein, comprising the step of 
contacting a bacterial target protein with a test compound; and 
determining whether said compound binds to or reduces the level of 
activity of said target protein, 
wherein binding of said compound with said target protein or a 
reduction of the level of activity of said protein is indicative that said compound is 
active on said target and wherein said target is uncharacterized. 

71. The method of claim 70, wherein said contacting is carried out in vitro. 

72. The method of claim 70, wherein said contacting is carried out in vivo 

73. The method of claim 70, wherein said compound is a small molecule. 

74. The method of claim 70, wherein said compound is a peptidomimetic 

75. The method of claim 70, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 

76. The method of claim 70, further comprising determining the site of 
action of said compound on said target protein. 

77. The method of claim 70, wherein said contacting is performed for a 
plurality of said target proteins. 

78. A method of screening for potential antibacterial agents, comprising 
the step of determining whether any of a plurality of compounds is active on a target 
of a bacteriophage inhibitor protein, 
wherein said target is naturally produced by a pathogenic bacterium. 

79. The method of claim 78, wherein said plurality of compounds are 
small molecules. 





80. The method of claim 78, wherein said determining is performed for a 
plurality of said targets. 



81. A method for inhibiting a bacterium, comprising the step of: 
contacting said bacterium with a compound active on a target of a 
bacteriophage inhibitor protein, wherein said target or the target site is 
uncharacterized. 

82. The method of claim 81, wherein said compound is said protein or an 
active fragment thereof. 

83. The method of claim 81, wherein said compound is a structural 
mimetic of said protein. 

84. The method of claim 81, wherein said compound is a small molecule. 

85. The method of claim 81, wherein said contacting is performed in vitro. 

86. The method of claim 81, wherein said contacting is performed in vivo 
in an animal. 

87. The method of claim 86, wherein said animal is a human. 

88. The method of claim 81, wherein said contacting is carried out in vivo 
in a plant. 

89. The method of claim 81, wherein said bacterium is selected from the 
group of bacteria listed in Table 1. 

90. A method for treating a bacterial infection in an animal suffering from 
an infection, comprising administering to said animal a therapeutically effective 
amount of compound active on a target of a bacteriophage inhibitor protein in a 
bacterium involved in said infection, 
wherein said target is an uncharacterized target or the compound is active at an 
uncharacterized target site. 

91. The method of claim 90, wherein said compound is a small molecule. 

92. The method of claim 90, wherein said compound is a peptidomimetic 
compound. 

93. The method of claim 90, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 

94. The method of claim 90, wherein said animal is a mammal. 

95. The method of claim 94, wherein said mammal is a human. 

96. The method of claim 90, wherein said bacterium is selected from the 
group listed in Table 1. 

97. The method of claim 90, wherein said bacteriophage inhibitor protein 
is from a bacteriophage selected from the group of bacteriophage listed in Table 1. 

98. A method for prophylactically treating an animal at risk of an infection, 
comprising administering to said animal a prophylactically effective amount of a 
compound active on a target of a bacteriophage inhibitor protein, 
wherein said target is an uncharacterized target or the site of action of 
said compound is an uncharacterized target site. 

99. The method of claim 98, wherein said compound is a small molecule. 

100. The method of claim 98, wherein said compound is a peptidomimetic 
compound. 

101. The method of claim 98, wherein said compound is a fragment of a 
bacteriophage inhibitor protein. 

102. The method of claim 98, wherein said animal is a mammal. 

103. The method of claim 102, wherein said mammal is a human. 



104. An antibacterial agent active on a target of a bacteriophage inhibitor 
protein, wherein said target is an uncharacterized target or said agent is active at a 
phage-specific site on said target. 

105. The agent of claim 104, wherein said agent is a peptidomimetic of a 
bacteriophage inhibitor polypeptide. 


106. The agent of claim 104, wherein said agent is a small molecule. 

107. The agent of claim 104, wherein said agent is a fragment of a 
bacteriophage inhibitor polypeptide. 

108. The agent of claim 104, wherein said agent is active at a phage-specific 
site on said target. 



109. A method of making an antibacterial agent, comprising the steps of: 

a) identifying a target of a bacteriophage inhibitor polypeptide; 

b) screening a plurality of test compounds to identify a compound 
active on said target; and 

c) synthesizing said compound in an amount sufficient to provide a 
therapeutic effect when administered to an organism infected by a bacterium naturally 
producing said target. 

110. The method of claim 109, wherein said compound is a small molecule. 

111. The method of claim 109, wherein said compound is a peptidomimetic 
compound. 

112. The method of claim 109, wherein said compound is a fragment or 
derivative of a bacteriophage inhibitor protein. 

113. A computer readable device having recorded therein a nucleotide 
sequence of a portion of at least one bacteriophage genome of Staphylococcus aureus 
bacteriophage 77, bacteriophage 3A, or bacteriophage 96, a nucleotide sequence at 
least 95% identical to a said nucleotide sequence, a ribonucleic acid equivalent, a 
degenerate equivalent, a homologous sequence, or at least one amino acid sequence 
encoded by said nucleotide sequence; and 

a nucleotide sequence or amino acid sequence analysis program, 
wherein said program can perform at least one sequence analysis on said 
nucleotide or amino acid sequence. 
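
As a hedged illustration of the kind of sequence analysis such a program might perform, the sketch below computes percent identity between two sequences that have already been aligned to equal length (with `-` marking gaps) and checks it against a 95% threshold. The function names are hypothetical, and the choice of denominator (all alignment columns, gaps included) is an assumption made for the example; other conventions for percent identity exist and the claims do not fix one here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    Columns containing a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=95.0):
    """Return True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With one mismatch in a 20-column alignment the sketch reports 95.0% and the threshold check passes; with one mismatch in 10 columns it reports 90.0% and the check fails.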

114. The device of claim 113, wherein said at least a portion of at least one 
bacteriophage genome comprises at least one ORF. 

115. The device of claim 113, wherein said device comprises a medium 
selected from the group consisting of floppy disk, computer hard drive, optical disk, 
computer random access memory, and magnetic tape wherein said nucleotide or 
amino acid sequence or said program or both are recorded on said medium. 

116. The device of claim 113, wherein said portion of at least one 
bacteriophage genomic nucleotide sequence comprises at least 50% of at least one 
bacteriophage genomic sequence. 

117. The device of claim 113, wherein said at least one bacteriophage 
nucleotide genomic sequence comprises portions of a plurality of bacteriophage 
nucleotide genomic sequences. 

118. A computer-based system for identifying biologically important 
portions of a bacteriophage genome, comprising: 

a) a data storage medium having recorded thereon a nucleotide sequence 
corresponding to a portion of at least one bacteriophage genome, wherein said 
bacteriophage genome is uncharacterized; 

b) a set of instructions allowing searching of said sequence to analyze said 
sequence; and 

c) an output device. 

119. The system of claim 118, wherein said output device comprises 
a device selected from the group consisting of a printer, a video display, 
and a recording medium. 

120. The system of claim 118, wherein said bacteriophage genome is of a 
bacteriophage selected from the group consisting of uncharacterized bacteriophage 
listed in Table 1. 

121. The system of claim 118, wherein said uncharacterized bacteriophage 
is selected from the group consisting of bacteriophage 77, 3A, and 96. 

122. A method for identifying or characterizing a bacteriophage ORF, 
comprising the steps of: 

a) providing a computer-based system for analyzing nucleic acid or 
amino acid sequence data, wherein said system comprises a data storage medium 
having recorded thereon at least one nucleotide or amino acid sequence corresponding 
to a portion of at least one uncharacterized bacteriophage genome, a set of instructions 
allowing searching of said sequence to analyze said sequence; and an output device; 

b) analyzing at least a portion of at least one said sequence; and 

c) outputting results of said analyzing to said output device. 

123. The method of claim 122, wherein said analysis identifies sequence 
similarity or homology with sequences selected from the group consisting of bacterial 
ORFs encoding products with related biological function, ORFs encoding known 
inhibitors of bacteria, and essential bacterial ORFs. 

124. The method of claim 122, wherein said analysis comprises identifying 
a probable biological function based on identification of structural elements or 
sequence homology or similarity. 


125. The method of claim 122, wherein said bacteriophage is selected from 
the group consisting of uncharacterized bacteriophage listed in Table 1. 



126. The method of claim 125, wherein said uncharacterized bacteriophage 
is selected from bacteriophage 77, 3A, and 96. 

FIG. 1B. 

PCR of pT0021 with XhoF & BamHNR 
XhoF: 5'-AATTCTCGAGTAAAATAACAT-3' (XhoI site: CTCGAG) 
BamHNR: AAATCAGGTGACTGTTGAGAAAAGGAGGCGGATCCCG (stop of arsR, RBS, BamHI) 
Digestion with XhoI & BamHI 
Ligation 
Result: modified between stop of arsR and BamHI 

PCR of pT0021 with LucFFB & LucFFH 
LucFFB: 5'-CGGGATCCATGAGGGGTTCCGAAGACG (BamHI, start of LucFF; original BamHI was modified) 
LucFFH: GAMGTCCAAATTGTAAGCTTGGG (stop of LucFF, HindIII) 
Digestion with BamHI & HindIII 
Ligation 
Result: modified in the vicinity of BamHI 

Cloning site for ORFs: BamHI & HindIII 
No additional codons in the induced protein 

Resulting cassette: XhoI (CTCGAG), P, arsR, (TGA)GAAAAGGAGGCGGATCC(ATG) (RBS, BamHI), LucFF, HindIII 

[Cloning diagram; labels: BamHI, SalI, ATG, last codon, HindIII] 
Digestion with BamHI & SalI (both arms of the diagram) 
Ligation 

FIG. 3. 

(A) Functional assay on semi-solid support media 

Frozen stock of phage 77 pTHA/ORF S. aureus RN4220 transformants 
1:10 and 1:100 dilution in saline solution 
Either: streak 5 µl of the 1:10 dilution onto agar plates containing 0, 2.5, 5, and 7.5 µM NaAsO2 
Or: spot the 1:10 and 1:100 dilutions onto agar plates containing 0, 2.5, 5, and 7.5 µM NaAsO2 
O/N, 37°C 
Compare bacterial growth on plates with and without NaAsO2 

(B) Functional assay in liquid medium 

O/N culture inoculated from frozen stock of phage 77 pTHA/ORF S. aureus RN4220 transformants 
1:100 dilution of O/N culture 
2 h, 37°C, 250 rpm 
Fresh culture 
150 µl into 2.5 ml containing 0 and 5 µM NaAsO2 
3.5 h, 37°C, 250 rpm 
Measure OD565; 1:10 serial dilution from 10^-1 to 10^-6 
Spot 20 µl of the 10^-4 and higher dilutions onto agar plate 
O/N, 37°C 
Count colonies 
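
The colony counts obtained at the end of the liquid-medium assay can be converted to viable cell concentrations with the standard plate-count relation, CFU/ml = colonies / (dilution × plated volume in ml). The following is a minimal sketch of that arithmetic, assuming the 20 µl spot volume of panel (B); the function names and the idea of summarizing inhibition as a ratio of induced to uninduced counts are illustrative assumptions, not part of the figure.

```python
def cfu_per_ml(colonies, dilution, spot_volume_ul=20.0):
    """Viable count per ml of the undiluted culture.

    colonies:        number of colonies counted in one spot
    dilution:        dilution of the spotted sample, e.g. 1e-4
    spot_volume_ul:  volume spotted, in microlitres (20 µl assumed from FIG. 3B)
    """
    if dilution <= 0 or spot_volume_ul <= 0:
        raise ValueError("dilution and spot volume must be positive")
    return colonies / (dilution * spot_volume_ul / 1000.0)


def viability_ratio(cfu_induced, cfu_uninduced):
    """Ratio of viable counts with inducer (e.g. 5 µM NaAsO2) to without (0 µM)."""
    if cfu_uninduced <= 0:
        raise ValueError("uninduced count must be positive")
    return cfu_induced / cfu_uninduced
```

For instance, 30 colonies in a 20 µl spot of the 10^-4 dilution corresponds to about 1.5 × 10^7 CFU/ml. A ratio well below 1 between the induced and uninduced cultures would be consistent with inhibition following ORF expression, although the figure itself does not specify a cutoff.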

FIG. 7. 

Abbreviations: 

kan: gene encoding kanamycin resistance 
cat: gene encoding chloramphenicol resistance 
ori+ and ori-: origin of replication in gram-positive and gram-negative bacteria, respectively 
arsR: gene encoding regulatory protein of the ars promoter 
P: ars promoter 
lucFF: gene encoding luciferase protein. This portion will be removed and replaced by individual S. aureus phage genes. 

Reference: 

Tauriainen et al., Appl. Environ. Microbiol. 1997. 63: 4456-4461 